chore(ops): add RLS preflight and registry certbot repair package
Your Name
2026-05-12 18:25:53 +08:00
parent a18e2f9c3f
commit 0bc1878778
6 changed files with 752 additions and 0 deletions

@@ -1,3 +1,56 @@
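## 2026-05-12 | RLS Preflight and 188 Registry Certbot Repair Package
**Background:** Wave 1 confirmed that production RLS is a P0 but cannot be hot-enabled directly. The certbot renewal for `registry.wooo.work` on 188 is also confirmed broken, but the `ollama` SSH account currently has no passwordless sudo. This round turns both red lights into remediation preflight packages that can be rerun, handed over, and reviewed for approval.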
**New RLS preflight**
- `scripts/ops/awooop_rls_preflight.py`
  - Designed to run inside the production API pod using the pod-local `DATABASE_URL`; it does not print the DB URL or password.
  - Read-only checks of the DB role, `set_config('app.project_id')`, the `project_id` column on target tables, RLS enabled/forced/policy state, and fail-open policy expressions.
  - Runs the exact `COUNT(*)` / `NULL project_id` scan only when `--exact-counts` is passed.
- `scripts/ops/awooop-rls-preflight.sh`
  - By default runs `sudo kubectl -n awoooi-prod exec -i deployment/awoooi-api -c api -- python -` through `wooo@192.168.0.120`.
  - Supports `--local`, `--json`, and `--exact-counts`.
  - Exit `2` means the RLS gate is blocked and RLS must not be enabled.
- `docs/runbooks/AWOOOP-RLS-PREFLIGHT.md`
  - Records the 2026-05-12 production preflight result and the remediation order.
**RLS live preflight result**
- `bash scripts/ops/awooop-rls-preflight.sh --exact-counts` → exit `2`, matching the blocked gate.
- `PASS=5 WARN=0 BLOCKED=2`
- PASS
  - The current DB user `awoooi` is neither superuser nor bypassrls.
  - `set_config('app.project_id', 'awoooi', TRUE)` works.
  - Every existing target table has `project_id`.
  - The production DB currently has no fail-open policy expression.
  - Exact counts show `NULL project_id = 0` for the existing target tables.
- BLOCKED
  - The `awooop_app`, `awooop_platform_admin`, and `awooop_migration` roles do not exist.
  - Target tables are not yet RLS enabled / forced / policied.
- Interpretation: the next step is not a data backfill but role bootstrap + DB access path audit + staged policy enablement. The current production app user is `awoooi`, so the policy design must first decide whether to grant it `awooop_app` membership or to switch the connection role.
**New 188 registry certbot repair package**
- `scripts/ops/188-registry-certbot-fix.sh`
  - Root-only helper; dry-run by default, and it changes 188 only when `--apply` is passed.
  - Creates `/var/www/certbot`.
  - Installs `/etc/nginx/conf.d/registry-acme-http.conf` so that `registry.wooo.work` HTTP-01 no longer falls through to the `aiops.wooo.work` default vhost.
  - Reloads Nginx after `nginx -t`.
  - Renews with `/snap/bin/certbot renew --cert-name registry.wooo.work`.
  - When snap certbot is present, disables the broken apt `certbot.timer` and resets the failed state of the apt certbot service.
- `docs/runbooks/REGISTRY-CERTBOT-188.md`
  - Records the expired cert, the wrong route, the apt/snap certbot owner split, and the post-fix verification commands.
**Verification**
- `python3 -m py_compile scripts/ops/awooop_rls_preflight.py` → passed.
- `bash -n scripts/ops/awooop-rls-preflight.sh scripts/ops/188-registry-certbot-fix.sh` → passed.
- `scripts/ops/188-registry-certbot-fix.sh` dry-run → printed the expected actions without modifying the local machine or 188.
- The RLS preflight ran end to end against the production API pod; the blocked result is as expected and the DB was not changed.
- The helper was synced to 188 at `/home/ollama/awoooi-ops/188-registry-certbot-fix.sh`.
- On 188, remote `bash -n` passed and the remote dry-run printed the expected root actions without changing Nginx / certbot.
**Next steps**
- An operator with sudo rights runs `sudo /home/ollama/awoooi-ops/188-registry-certbot-fix.sh --apply` on 188.
- For RLS, review the role bootstrap design first, then produce batch migrations; the existing RLS migration must not be applied as-is.
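## 2026-05-12 | Wave 1 Claude P0 Red-Light Verification and GitHub CD Lockdown
**Background:** The Claude Code inventory can only serve as a candidate list; each item must be verified against the production DB, host state, provider logs, and repo artifacts. This round first handles the red lights that are high-risk and quick to confirm.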

@@ -0,0 +1,88 @@
# AwoooP RLS Preflight Runbook
> Purpose: verify whether production is ready for PostgreSQL Row-Level Security
> without enabling RLS or changing data.
## Command
Default path runs the probe inside the production API pod through the 120
control-plane host. `DATABASE_URL` stays inside Kubernetes and is not printed.
```bash
bash scripts/ops/awooop-rls-preflight.sh
```
Before enabling RLS, run exact backfill counts:
```bash
bash scripts/ops/awooop-rls-preflight.sh --exact-counts
```
Useful variants:
```bash
bash scripts/ops/awooop-rls-preflight.sh --json
bash scripts/ops/awooop-rls-preflight.sh --local
AWOOOP_RLS_SSH_TARGET=wooo@192.168.0.120 bash scripts/ops/awooop-rls-preflight.sh
```
Exit code `2` means the gate is blocked and RLS must not be enabled yet.
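The exit-code contract can be wrapped in a small gate check. The sketch below is a hypothetical helper and not part of the repo; `may_enable_rls` and `describe` are assumed names. It reflects one detail of the probe: without `--exact-counts` the backfill check only reports `WARN`, so exit `0` alone does not show that the backfill is clean.

```python
def may_enable_rls(exit_code: int, exact_counts: bool) -> bool:
    """Return True only when a preflight run clears the RLS gate.

    Hypothetical helper. Exit 0 means no check was BLOCKED. Exit 2 means the
    gate is blocked (the probe also exits 2 on an unexpected exception).
    Any other code (64 usage error, 66 missing probe, or an ssh failure)
    means the preflight itself did not run cleanly.
    """
    if exit_code != 0:
        return False
    # Exit 0 without --exact-counts still leaves the backfill check at WARN.
    return exact_counts


def describe(exit_code: int) -> str:
    """Map an exit code to a short operator-facing label."""
    if exit_code == 0:
        return "no blocked checks"
    if exit_code == 2:
        return "gate blocked: do not enable RLS"
    return "preflight did not complete"
```

Treating every non-zero code as "not cleared" keeps the gate fail-closed when SSH or `kubectl` fails before the probe runs.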
## 2026-05-12 Production Result
`--exact-counts` returned:
- `PASS current_role_rls_enforced`: current DB user is `awoooi`, not superuser and not `BYPASSRLS`.
- `PASS project_context_set_config`: `set_config('app.project_id', 'awoooi', TRUE)` works in the API pod.
- `BLOCKED required_roles`: `awooop_app`, `awooop_platform_admin`, and `awooop_migration` do not exist.
- `PASS project_id_columns`: every existing target table has `project_id`.
- `BLOCKED rls_enabled_forced_policy`: existing target tables are not yet RLS enabled, forced, or policied.
- `PASS fail_open_policies`: production DB currently has no fail-open policy expressions.
- `PASS project_id_backfill`: exact counts found zero `NULL project_id` rows in counted target tables.
Current blocker summary:
```text
PASS=5 WARN=0 BLOCKED=2
```
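The same summary can be derived from the `--json` output, which carries a `checks` list of `name` / `status` / `detail` objects. The following is a minimal sketch, assuming that shape; `summarize` is a hypothetical name and the sample payload only mirrors the check names and statuses listed above, with details left empty.

```python
import json
from collections import Counter


def summarize(json_text: str) -> Counter:
    """Count check statuses in the preflight's --json output."""
    payload = json.loads(json_text)
    return Counter(check["status"] for check in payload["checks"])


# Sample payload mirroring the 2026-05-12 statuses (details omitted).
sample = json.dumps({
    "checks": [
        {"name": "current_role_rls_enforced", "status": "PASS", "detail": ""},
        {"name": "project_context_set_config", "status": "PASS", "detail": ""},
        {"name": "required_roles", "status": "BLOCKED", "detail": ""},
        {"name": "project_id_columns", "status": "PASS", "detail": ""},
        {"name": "rls_enabled_forced_policy", "status": "BLOCKED", "detail": ""},
        {"name": "fail_open_policies", "status": "PASS", "detail": ""},
        {"name": "project_id_backfill", "status": "PASS", "detail": ""},
    ],
    "evidence": {},
})

counts = summarize(sample)
print(f"PASS={counts['PASS']} WARN={counts['WARN']} BLOCKED={counts['BLOCKED']}")
```

`Counter` returns `0` for a status that never appears, so `WARN` needs no special handling.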
Important exact counts from the same run:
| Table | Rows | NULL project_id |
| --- | ---: | ---: |
| `audit_logs` | 686 | 0 |
| `awooop_mcp_tool_registry` | 4 | 0 |
| `awooop_outbound_message` | 228 | 0 |
| `awooop_projects` | 2 | 0 |
| `awooop_run_state` | 106 | 0 |
| `incidents` | 1518 | 0 |
| `knowledge_entries` | 2099 | 0 |
| `playbooks` | 220 | 0 |
## Remediation Order
1. Create or reconcile RLS roles.
   - Current production app user is `awoooi`; policy design must either grant it
     membership in `awooop_app` or update the application connection role before
     policies are enforced.
   - Do not create passworded LOGIN roles in a migration unless the K8s Secret
     rotation path is ready.
2. Verify all DB access paths use `get_db()` / `get_db_context()` or otherwise set
   `app.project_id` before queries.
3. Apply policies first in staging or a canary DB.
4. In production, enable one batch at a time.
5. After each batch, run:

   ```bash
   bash scripts/ops/awooop-rls-preflight.sh --exact-counts
   ```

6. Validate AwoooP Runs, Approvals, Monitoring, Tickets, Cost, alert ingestion,
   background workers, and TelegramGateway mirror paths.
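Steps 3 to 5 can be pictured as generating one transaction per batch of tables. The sketch below is illustrative only and is not the repo's RLS migration: the policy name `project_isolation`, the helper names, and the assumption that `project_id` is text-comparable with `current_setting('app.project_id', true)` are all hypothetical. It omits any `TO <role>` clause because the role path is still undecided.

```python
from typing import Iterator


def quote_ident(name: str) -> str:
    """Double-quote a PostgreSQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def batches(tables: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` tables."""
    for start in range(0, len(tables), size):
        yield tables[start:start + size]


def batch_sql(tables: list[str], policy: str = "project_isolation") -> str:
    """Render one transaction that adds a fail-closed policy per table.

    With an unset app.project_id, current_setting(..., true) is NULL, the
    comparison is NULL, and no row matches, so the predicate fails closed.
    """
    predicate = "project_id = current_setting('app.project_id', true)"
    lines = ["BEGIN;"]
    for table in tables:
        ident = quote_ident(table)
        lines.append(
            f"CREATE POLICY {policy} ON {ident} "
            f"USING ({predicate}) WITH CHECK ({predicate});"
        )
        lines.append(f"ALTER TABLE {ident} ENABLE ROW LEVEL SECURITY;")
        lines.append(f"ALTER TABLE {ident} FORCE ROW LEVEL SECURITY;")
    lines.append("COMMIT;")
    return "\n".join(lines)
```

Rendering each batch as its own transaction means a failed statement rolls back the whole batch, and the preflight can be rerun between batches to confirm the new state.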
## Do Not
- Do not enable all policies in production before the role path is decided.
- Do not rely on fail-open `IS NULL` or empty-string policies as the target state.
- Do not run destructive rollback SQL unless the incident commander explicitly
approves it.
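For reviewing policy expressions by hand, a fail-open check can be sketched as a pair of regular expressions. This is a hedged illustration and not the preflight's own logic: PostgreSQL commonly prints stored expressions with explicit casts such as `'app.project_id'::text`, so the patterns below tolerate an optional `::text`. The names `is_fail_open`, `NULL_FAIL_OPEN`, and `EMPTY_FAIL_OPEN` are hypothetical.

```python
import re

# current_setting('app.project_id', true), with an optional ::text cast
# after the setting name as it may appear in a deparsed expression.
_SETTING = r"current_setting\(\s*'app\.project_id'(?:::text)?\s*,\s*true\s*\)"

NULL_FAIL_OPEN = re.compile(_SETTING + r"\s+IS\s+NULL", re.IGNORECASE)
EMPTY_FAIL_OPEN = re.compile(_SETTING + r"\s*=\s*''", re.IGNORECASE)


def is_fail_open(expr: str) -> bool:
    """True when a policy expression admits rows with no project context."""
    return bool(NULL_FAIL_OPEN.search(expr) or EMPTY_FAIL_OPEN.search(expr))


strict = "(project_id = current_setting('app.project_id'::text, true))"
null_open = (
    "((current_setting('app.project_id'::text, true) IS NULL) OR "
    "(project_id = current_setting('app.project_id'::text, true)))"
)
empty_open = "((current_setting('app.project_id', true) = '') OR (project_id = 'x'))"
```

Here `is_fail_open(strict)` is false, while both `null_open` and `empty_open` are flagged. A text match like this is a review aid only; it cannot prove that a policy is safe.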

@@ -0,0 +1,62 @@
# 188 Registry Certbot Recovery
> Scope: `registry.wooo.work` on host `192.168.0.188`.
## Verified State On 2026-05-12
- `registry.wooo.work` certificate expired at `May 8 04:16:08 2026 GMT`.
- HTTP-01 route check:
```text
http://registry.wooo.work/.well-known/acme-challenge/codex-route-check
-> 301 https://aiops.wooo.work/.well-known/acme-challenge/codex-route-check
-> 404
```
- `/usr/bin/certbot` is broken by Python/OpenSSL mismatch.
- `/snap/bin/certbot` exists and should be the renewal owner.
- Both apt `certbot.timer` and snap `snap.certbot.renew.timer` were enabled.
- The `ollama` SSH user is in sudo group but has no passwordless sudo in this
session, so Codex could not apply the root-level fix directly.
## Fix Script
The repo includes a root-only helper. It is dry-run by default:
```bash
bash scripts/ops/188-registry-certbot-fix.sh
```
To apply on 188:
```bash
sudo bash /home/ollama/awoooi-ops/188-registry-certbot-fix.sh --apply
```
The script:
- creates `/var/www/certbot`;
- installs `/etc/nginx/conf.d/registry-acme-http.conf`;
- routes `registry.wooo.work` HTTP-01 to `/var/www/certbot`;
- reloads Nginx after `nginx -t`;
- renews `registry.wooo.work` via `/snap/bin/certbot`;
- disables the broken apt `certbot.timer` when snap certbot is present;
- prints the renewed certificate dates.
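The dry-run-by-default behavior follows a simple pattern: every mutating command goes through one wrapper that either executes it or prints it. A minimal Python rendering of that pattern is sketched below for illustration; the helper in the repo is a bash function, and the `run` signature here is an assumption.

```python
import shlex
import subprocess


def run(cmd: list[str], apply: bool = False) -> str:
    """Execute cmd when apply is True; otherwise only print what would run."""
    rendered = shlex.join(cmd)  # shell-quoted, safe to copy and paste
    if apply:
        subprocess.run(cmd, check=True)  # raises on a non-zero exit
        return rendered
    line = f"DRY-RUN: {rendered}"
    print(line)
    return line


plan = [
    run(["install", "-d", "-m", "0755", "/var/www/certbot"]),
    run(["nginx", "-t"]),
    run(["certbot", "renew", "--deploy-hook", "systemctl reload nginx"]),
]
```

Because the default is `apply=False`, forgetting the flag prints the plan instead of changing the host.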
## Post-Fix Verification
Run from any host with network access:
```bash
curl -sI --max-redirs 0 http://registry.wooo.work/.well-known/acme-challenge/codex-route-check
openssl s_client -servername registry.wooo.work -connect registry.wooo.work:443 </dev/null 2>/dev/null \
| openssl x509 -noout -subject -issuer -dates
```
Expected:
- HTTP challenge path returns `404` from the `registry.wooo.work` vhost, not a
redirect to `aiops.wooo.work`.
- `notAfter` is renewed to a future date.
- `systemctl --failed` no longer lists apt `certbot.service` after failed state
reset.
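The first two expectations can be checked mechanically once the `curl` status and the `openssl` output are captured. The sketch below assumes the `notAfter=<date> GMT` line format printed by `openssl x509 -dates`; the function names are hypothetical and the sample value is the expired date recorded above.

```python
from datetime import datetime, timezone
from typing import Optional


def acme_route_ok(status: int, location: Optional[str] = None) -> bool:
    """A 404 with no redirect is consistent with the registry vhost answering."""
    return status == 404 and not location


def parse_not_after(openssl_output: str) -> datetime:
    """Extract the notAfter timestamp from `openssl x509 -dates` output."""
    for line in openssl_output.splitlines():
        if line.startswith("notAfter="):
            raw = line[len("notAfter="):].strip()
            if raw.endswith(" GMT"):
                raw = raw[:-4]
            parsed = datetime.strptime(raw, "%b %d %H:%M:%S %Y")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no notAfter line found")


def cert_is_valid(openssl_output: str, now: datetime) -> bool:
    """True when the certificate's notAfter is still in the future."""
    return parse_not_after(openssl_output) > now


expired = "notAfter=May  8 04:16:08 2026 GMT"
checked_at = datetime(2026, 5, 12, tzinfo=timezone.utc)
```

With these inputs `cert_is_valid(expired, checked_at)` is false, matching the pre-fix state; after a successful renewal the same check should return true.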

@@ -0,0 +1,117 @@
#!/usr/bin/env bash
# Repair helper for 188 registry.wooo.work HTTP-01 renewal.
# Default is dry-run. Use --apply on 188 as root after reviewing the plan.

set -euo pipefail

APPLY=0
DOMAIN="${REGISTRY_CERTBOT_DOMAIN:-registry.wooo.work}"
WEBROOT="${REGISTRY_CERTBOT_WEBROOT:-/var/www/certbot}"
NGINX_SNIPPET="${REGISTRY_CERTBOT_NGINX_SNIPPET:-/etc/nginx/conf.d/registry-acme-http.conf}"
CERTBOT_BIN="${REGISTRY_CERTBOT_BIN:-/snap/bin/certbot}"

usage() {
  cat <<'USAGE'
Usage: sudo bash scripts/ops/188-registry-certbot-fix.sh [--apply]

Fixes the known 188 drift where registry.wooo.work HTTP-01 traffic falls through
to the aiops.wooo.work default server and certbot cannot renew the registry cert.

Default mode is dry-run and prints the exact actions. --apply requires root.

Environment:
  REGISTRY_CERTBOT_DOMAIN         Default: registry.wooo.work
  REGISTRY_CERTBOT_WEBROOT        Default: /var/www/certbot
  REGISTRY_CERTBOT_NGINX_SNIPPET  Default: /etc/nginx/conf.d/registry-acme-http.conf
  REGISTRY_CERTBOT_BIN            Default: /snap/bin/certbot
USAGE
}

while [ "$#" -gt 0 ]; do
  case "$1" in
    --apply)
      APPLY=1
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      echo "Unknown argument: $1" >&2
      usage >&2
      exit 64
      ;;
  esac
  shift
done

run() {
  if [ "$APPLY" -eq 1 ]; then
    "$@"
  else
    printf 'DRY-RUN:'
    printf ' %q' "$@"
    printf '\n'
  fi
}

write_snippet() {
  local tmp
  tmp="$(mktemp)"
  cat > "$tmp" <<EOF
# Managed by AWOOOI registry certbot repair.
# LetsEncrypt HTTP-01 must not fall through to aiops.wooo.work.
server {
    listen 80;
    server_name ${DOMAIN};

    location /.well-known/acme-challenge/ {
        root ${WEBROOT};
        default_type "text/plain";
    }

    location / {
        return 301 https://\$host\$request_uri;
    }
}
EOF
  run install -m 0644 "$tmp" "$NGINX_SNIPPET"
  rm -f "$tmp"
}

if [ "$APPLY" -eq 1 ] && [ "$(id -u)" -ne 0 ]; then
  echo "--apply must be run as root on 188" >&2
  exit 77
fi

if [ "$APPLY" -eq 1 ] && [ ! -x "$CERTBOT_BIN" ]; then
  echo "certbot binary not executable: $CERTBOT_BIN" >&2
  exit 69
fi

echo "Plan: repair HTTP-01 route for ${DOMAIN}, renew via ${CERTBOT_BIN}, reload nginx."

run install -d -m 0755 "$WEBROOT"
write_snippet
run nginx -t
run systemctl reload nginx

if [ "$APPLY" -eq 1 ]; then
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 8 "http://${DOMAIN}/.well-known/acme-challenge/codex-route-check" || true)"
  if [ "$code" != "404" ]; then
    echo "Unexpected ACME route status after nginx reload: ${code}; expected 404 from ${DOMAIN}, not redirect/default vhost" >&2
    exit 1
  fi
fi

run "$CERTBOT_BIN" renew --cert-name "$DOMAIN" --deploy-hook "systemctl reload nginx"

if [ -x /snap/bin/certbot ]; then
  run systemctl disable --now certbot.timer
  run systemctl reset-failed certbot.service
fi

if [ "$APPLY" -eq 1 ]; then
  openssl x509 -noout -subject -issuer -dates -in "/etc/letsencrypt/live/${DOMAIN}/fullchain.pem"
  systemctl status snap.certbot.renew.timer --no-pager -l | sed -n '1,25p' || true
else
  echo "Dry-run only. Re-run with --apply on 188 as root to execute."
fi

@@ -0,0 +1,100 @@
#!/usr/bin/env bash
# Read-only AwoooP RLS preflight runner.
#
# Default path runs inside the production API pod through the 120 control-plane
# host, so DATABASE_URL stays inside Kubernetes and is never printed locally.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PY_SCRIPT="${SCRIPT_DIR}/awooop_rls_preflight.py"
NAMESPACE="${AWOOOP_RLS_NAMESPACE:-awoooi-prod}"
DEPLOYMENT="${AWOOOP_RLS_DEPLOYMENT:-deployment/awoooi-api}"
CONTAINER="${AWOOOP_RLS_CONTAINER:-api}"
SSH_TARGET="${AWOOOP_RLS_SSH_TARGET:-wooo@192.168.0.120}"
REMOTE_KUBECTL="${AWOOOP_RLS_REMOTE_KUBECTL:-sudo kubectl}"
KUBECTL="${AWOOOP_RLS_KUBECTL:-kubectl}"
USE_SSH=1
PY_ARGS=()
SSH_OPTS=(-o BatchMode=yes -o ConnectTimeout=8)

usage() {
  cat <<'USAGE'
Usage: bash scripts/ops/awooop-rls-preflight.sh [options]

Read-only checks for AwoooP PostgreSQL RLS readiness. The script runs the Python
probe inside the API pod and exits 2 when RLS is not ready to enable.

Options:
  --exact-counts   Run exact COUNT(*) project_id backfill checks.
  --json           Print JSON output from the pod.
  --local          Use local kubectl instead of SSH to 120.
  --ssh USER@HOST  Override SSH target. Default: wooo@192.168.0.120.
  -h, --help       Show this help.

Environment:
  AWOOOP_RLS_NAMESPACE       Default: awoooi-prod
  AWOOOP_RLS_DEPLOYMENT      Default: deployment/awoooi-api
  AWOOOP_RLS_CONTAINER       Default: api
  AWOOOP_RLS_SSH_TARGET      Default: wooo@192.168.0.120
  AWOOOP_RLS_REMOTE_KUBECTL  Default: sudo kubectl
  AWOOOP_RLS_KUBECTL         Default: kubectl
USAGE
}

while [ "$#" -gt 0 ]; do
  case "$1" in
    --exact-counts)
      PY_ARGS+=(--exact-counts)
      ;;
    --json)
      PY_ARGS+=(--json)
      ;;
    --local)
      USE_SSH=0
      ;;
    --ssh)
      shift
      SSH_TARGET="${1:-}"
      if [ -z "$SSH_TARGET" ]; then
        echo "--ssh requires USER@HOST" >&2
        exit 64
      fi
      USE_SSH=1
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      echo "Unknown argument: $1" >&2
      usage >&2
      exit 64
      ;;
  esac
  shift
done

if [ ! -f "$PY_SCRIPT" ]; then
  echo "Missing Python probe: $PY_SCRIPT" >&2
  exit 66
fi

if [ "$USE_SSH" -eq 1 ]; then
  # Quote each value so the remote shell receives it as a single word.
  printf -v namespace_q "%q" "$NAMESPACE"
  printf -v deployment_q "%q" "$DEPLOYMENT"
  printf -v container_q "%q" "$CONTAINER"
  remote_cmd="${REMOTE_KUBECTL} -n ${namespace_q} exec -i ${deployment_q} -c ${container_q} -- python -"
  if [ "${#PY_ARGS[@]}" -gt 0 ]; then
    for arg in "${PY_ARGS[@]}"; do
      printf -v arg_q "%q" "$arg"
      remote_cmd="${remote_cmd} ${arg_q}"
    done
  fi
  # The probe source is streamed over stdin; nothing is copied to the pod.
  ssh "${SSH_OPTS[@]}" "$SSH_TARGET" "$remote_cmd" < "$PY_SCRIPT"
else
  if [ "${#PY_ARGS[@]}" -gt 0 ]; then
    "$KUBECTL" -n "$NAMESPACE" exec -i "$DEPLOYMENT" -c "$CONTAINER" -- python - "${PY_ARGS[@]}" < "$PY_SCRIPT"
  else
    "$KUBECTL" -n "$NAMESPACE" exec -i "$DEPLOYMENT" -c "$CONTAINER" -- python - < "$PY_SCRIPT"
  fi
fi

@@ -0,0 +1,332 @@
#!/usr/bin/env python3
"""
Read-only AwoooP RLS preflight.

This script is designed to run inside the production API pod. It uses the
pod-local DATABASE_URL and never prints the URL or credentials.
"""

from __future__ import annotations

import argparse
import asyncio
import json
import os
import sys
from dataclasses import asdict, dataclass
from typing import Any

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine


TARGET_TABLES = [
    "incidents",
    "knowledge_entries",
    "playbooks",
    "audit_logs",
    "budget_ledger",
    "awooop_projects",
    "awooop_contracts",
    "awooop_contract_revisions",
    "awooop_published_contracts",
    "awooop_run_state",
    "awooop_run_event",
    "awooop_cost_ledger",
    "awooop_mcp_tool_registry",
    "awooop_mcp_grants",
    "awooop_mcp_credential_refs",
    "awooop_mcp_gateway_audit",
    "awooop_conversation_event",
    "awooop_outbound_message",
]

REQUIRED_ROLES = [
    "awooop_app",
    "awooop_platform_admin",
    "awooop_migration",
]


@dataclass
class Check:
    name: str
    status: str
    detail: str


def add(checks: list[Check], name: str, status: str, detail: str) -> None:
    checks.append(Check(name=name, status=status, detail=detail))


async def scalar(conn: Any, sql: str, params: dict[str, Any] | None = None) -> Any:
    return await conn.scalar(text(sql), params or {})


async def rows(conn: Any, sql: str, params: dict[str, Any] | None = None) -> list[dict[str, Any]]:
    result = await conn.execute(text(sql), params or {})
    return [dict(row._mapping) for row in result.fetchall()]


async def collect(exact_counts: bool) -> tuple[list[Check], dict[str, Any]]:
    database_url = os.environ.get("DATABASE_URL")
    if not database_url:
        return [Check("database_url", "BLOCKED", "DATABASE_URL is not set in this environment")], {}

    engine = create_async_engine(database_url, pool_pre_ping=True)
    checks: list[Check] = []
    evidence: dict[str, Any] = {}

    async with engine.connect() as conn:
        current_role = await rows(
            conn,
            """
            SELECT
              current_user AS current_user,
              session_user AS session_user,
              r.rolsuper AS current_user_superuser,
              r.rolbypassrls AS current_user_bypassrls
            FROM pg_roles r
            WHERE r.rolname = current_user
            """,
        )
        evidence["current_role"] = current_role[0] if current_role else {}
        role = evidence["current_role"]
        if role.get("current_user_superuser") or role.get("current_user_bypassrls"):
            add(
                checks,
                "current_role_rls_enforced",
                "BLOCKED",
                f"current_user={role.get('current_user')} can bypass RLS",
            )
        else:
            add(
                checks,
                "current_role_rls_enforced",
                "PASS",
                f"current_user={role.get('current_user')} is subject to RLS",
            )

        before = await scalar(conn, "SELECT current_setting('app.project_id', TRUE)")
        await scalar(conn, "SELECT set_config('app.project_id', :pid, TRUE)", {"pid": "awoooi"})
        after = await scalar(conn, "SELECT current_setting('app.project_id', TRUE)")
        evidence["project_context_probe"] = {"before": before, "after": after}
        if after == "awoooi":
            add(checks, "project_context_set_config", "PASS", "set_config app.project_id works")
        else:
            add(checks, "project_context_set_config", "BLOCKED", f"expected awoooi, got {after!r}")

        roles = await rows(
            conn,
            """
            WITH required_roles(rolname) AS (
              SELECT jsonb_array_elements_text(CAST(:roles_json AS jsonb))
            )
            SELECT
              rr.rolname,
              r.rolsuper,
              r.rolbypassrls,
              r.oid IS NOT NULL AS exists
            FROM required_roles rr
            LEFT JOIN pg_roles r ON r.rolname = rr.rolname
            ORDER BY rr.rolname
            """,
            {"roles_json": json.dumps(REQUIRED_ROLES)},
        )
        evidence["required_roles"] = roles
        present_roles = {row["rolname"] for row in roles if row["exists"]}
        missing_roles = [role_name for role_name in REQUIRED_ROLES if role_name not in present_roles]
        if missing_roles:
            add(checks, "required_roles", "BLOCKED", f"missing roles: {', '.join(missing_roles)}")
        else:
            add(checks, "required_roles", "PASS", "all required RLS roles exist")

        table_rows = await rows(
            conn,
            """
            WITH target(relname) AS (
              SELECT jsonb_array_elements_text(CAST(:tables_json AS jsonb))
            ),
            rels AS (
              SELECT
                t.relname,
                c.oid,
                c.relrowsecurity,
                c.relforcerowsecurity,
                COALESCE(c.reltuples, 0)::bigint AS estimated_rows
              FROM target t
              LEFT JOIN pg_class c
                ON c.relname = t.relname
               AND c.relkind IN ('r', 'p')
               AND c.relnamespace = 'public'::regnamespace
            ),
            project_columns AS (
              SELECT table_name, TRUE AS has_project_id
              FROM information_schema.columns
              WHERE table_schema = 'public'
                AND column_name = 'project_id'
                AND table_name IN (SELECT relname FROM target)
            ),
            policy_stats AS (
              SELECT
                p.polrelid,
                COUNT(*) AS policy_count,
                BOOL_OR(
                  COALESCE(pg_get_expr(p.polqual, p.polrelid), '') ILIKE '%current_setting(''app.project_id'', true) IS NULL%'
                  OR COALESCE(pg_get_expr(p.polwithcheck, p.polrelid), '') ILIKE '%current_setting(''app.project_id'', true) IS NULL%'
                ) AS has_null_fail_open_policy,
                BOOL_OR(
                  COALESCE(pg_get_expr(p.polqual, p.polrelid), '') ILIKE '%current_setting(''app.project_id'', true) = ''''%'
                  OR COALESCE(pg_get_expr(p.polwithcheck, p.polrelid), '') ILIKE '%current_setting(''app.project_id'', true) = ''''%'
                ) AS has_empty_string_fail_open_policy
              FROM pg_policy p
              GROUP BY p.polrelid
            )
            SELECT
              r.relname AS table_name,
              r.oid IS NOT NULL AS exists,
              COALESCE(pc.has_project_id, FALSE) AS has_project_id,
              COALESCE(r.relrowsecurity, FALSE) AS rls_enabled,
              COALESCE(r.relforcerowsecurity, FALSE) AS rls_forced,
              COALESCE(ps.policy_count, 0) AS policy_count,
              COALESCE(ps.has_null_fail_open_policy, FALSE) AS has_null_fail_open_policy,
              COALESCE(ps.has_empty_string_fail_open_policy, FALSE) AS has_empty_string_fail_open_policy,
              r.estimated_rows
            FROM rels r
            LEFT JOIN project_columns pc ON pc.table_name = r.relname
            LEFT JOIN policy_stats ps ON ps.polrelid = r.oid
            ORDER BY r.relname
            """,
            {"tables_json": json.dumps(TARGET_TABLES)},
        )
        evidence["tables"] = table_rows

        existing = [row for row in table_rows if row["exists"]]
        missing_project_id = [row["table_name"] for row in existing if not row["has_project_id"]]
        if missing_project_id:
            add(checks, "project_id_columns", "BLOCKED", f"missing project_id: {', '.join(missing_project_id)}")
        else:
            add(checks, "project_id_columns", "PASS", "all existing target tables have project_id")

        rls_missing = [
            row["table_name"]
            for row in existing
            if not row["rls_enabled"] or not row["rls_forced"] or row["policy_count"] == 0
        ]
        if rls_missing:
            add(
                checks,
                "rls_enabled_forced_policy",
                "BLOCKED",
                f"RLS not fully enabled/forced/policied: {', '.join(rls_missing)}",
            )
        else:
            add(checks, "rls_enabled_forced_policy", "PASS", "all existing target tables have forced RLS policy")

        fail_open = [
            row["table_name"]
            for row in existing
            if row["has_null_fail_open_policy"] or row["has_empty_string_fail_open_policy"]
        ]
        if fail_open:
            add(checks, "fail_open_policies", "BLOCKED", f"fail-open policy expressions: {', '.join(fail_open)}")
        else:
            add(checks, "fail_open_policies", "PASS", "no fail-open policy expressions detected")

        if exact_counts:
            exact_rows: list[dict[str, Any]] = []
            for row in existing:
                if not row["has_project_id"]:
                    continue
                quoted = '"' + row["table_name"].replace('"', '""') + '"'
                count_row = await rows(
                    conn,
                    f"SELECT :table_name AS table_name, COUNT(*) AS total_rows, COUNT(*) FILTER (WHERE project_id IS NULL) AS null_project_id_rows FROM {quoted}",
                    {"table_name": row["table_name"]},
                )
                exact_rows.extend(count_row)
            evidence["exact_counts"] = exact_rows
            null_tables = [row["table_name"] for row in exact_rows if int(row["null_project_id_rows"]) > 0]
            if null_tables:
                add(checks, "project_id_backfill", "BLOCKED", f"NULL project_id remains: {', '.join(null_tables)}")
            else:
                add(checks, "project_id_backfill", "PASS", "no NULL project_id rows in counted tables")
        else:
            add(checks, "project_id_backfill", "WARN", "exact counts skipped; rerun with --exact-counts before enabling RLS")

    await engine.dispose()
    return checks, evidence


def print_human(checks: list[Check], evidence: dict[str, Any]) -> None:
    blocked = sum(1 for check in checks if check.status == "BLOCKED")
    warn = sum(1 for check in checks if check.status == "WARN")
    passed = sum(1 for check in checks if check.status == "PASS")
    print(f"AwoooP RLS preflight: PASS={passed} WARN={warn} BLOCKED={blocked}")
    for check in checks:
        print(f"{check.status:<7} {check.name}: {check.detail}")

    role = evidence.get("current_role") or {}
    if role:
        print(
            "role "
            f"current_user={role.get('current_user')} "
            f"session_user={role.get('session_user')} "
            f"superuser={role.get('current_user_superuser')} "
            f"bypassrls={role.get('current_user_bypassrls')}"
        )

    for row in evidence.get("tables", []):
        print(
            "table "
            f"{row['table_name']} "
            f"exists={row['exists']} "
            f"project_id={row['has_project_id']} "
            f"rls={row['rls_enabled']} "
            f"force={row['rls_forced']} "
            f"policies={row['policy_count']} "
            f"fail_open_null={row['has_null_fail_open_policy']} "
            f"fail_open_empty={row['has_empty_string_fail_open_policy']} "
            f"estimated_rows={row['estimated_rows']}"
        )

    for row in evidence.get("exact_counts", []):
        print(
            "count "
            f"{row['table_name']} "
            f"total_rows={row['total_rows']} "
            f"null_project_id_rows={row['null_project_id_rows']}"
        )


async def main() -> int:
    parser = argparse.ArgumentParser(description="Run read-only AwoooP RLS preflight checks.")
    parser.add_argument("--exact-counts", action="store_true", help="Run exact COUNT(*) checks for project_id backfill.")
    parser.add_argument("--json", action="store_true", help="Print JSON instead of human-readable output.")
    args = parser.parse_args()

    checks, evidence = await collect(exact_counts=args.exact_counts)
    blocked = any(check.status == "BLOCKED" for check in checks)
    if args.json:
        print(
            json.dumps(
                {"checks": [asdict(check) for check in checks], "evidence": evidence},
                ensure_ascii=False,
                default=str,
            )
        )
    else:
        print_human(checks, evidence)
    return 2 if blocked else 0


if __name__ == "__main__":
    try:
        raise SystemExit(asyncio.run(main()))
    except KeyboardInterrupt:
        raise SystemExit(130)
    except Exception as exc:
        print(f"BLOCKED preflight_exception: {exc}", file=sys.stderr)
        raise SystemExit(2)