IwoooS Monitoring / Alerting / Observability owner response acceptance
| Item | Content |
|---|---|
| Date | 2026-06-15 |
| Status | owner_response_acceptance_ledger_ready_no_runtime_action |
| Tool | scripts/security/monitoring-owner-response-acceptance.py |
| Snapshot | docs/security/monitoring-owner-response-acceptance.snapshot.json |
| Source inventory | docs/security/monitoring-alerting-observability-inventory.snapshot.json |
| Source owner request | docs/security/monitoring-owner-request-draft.snapshot.json |
| runtime gate | 0 |
1. Purpose
This document turns the Monitoring / Alerting / Observability owner request draft into a read-only owner response acceptance ledger. Its purpose is not to reload Prometheus / Alertmanager, send Telegram messages, or fire alerts. It fixes in advance how any future owner response must be checked for field completeness, redacted evidence, receiver receipt, stale alerts, silence / dedup, no-false-green, noise budget, maintenance window, rollback, and validation plan.
This phase remains a metadata-only acceptance ledger: it does not connect to live Prometheus, reload Alertmanager, apply Grafana / SigNoz changes, deploy Sentry, change Langfuse, reload OTEL, change receiver routes, create silences, send Telegram messages, fire live alerts, run alert chain smoke tests, use SSH or kubectl, read secret values, or write to production.
2. Summary
| Metric | Current value | Notes |
|---|---|---|
| acceptance candidate | 60 | All converted from the committed owner request draft |
| write-capable candidate | 11 | Surfaces that may reload, deploy, send notifications, fire alerts, or restart an exporter |
| live evidence required candidate | 60 | Every candidate requires the owner to provide a redacted live evidence ref |
| acceptance field | 38 | Fixed number of fields per candidate |
| required owner field | 14 | Number of fields the owner must fill in |
| reviewer check | 23 | Intake, quarantine, rejection, supplement-request, and reviewer triage checks |
| outcome lane | 12 | Outcomes including waiting, quarantine, rejection, supplement request, no-false-green, receipt gap, stale / silence review, post-reload readback, read-only update, and runtime gate |
| blocked action | 34 | All blocked until acceptance |
| request sent / recipient confirmed | 0 / 0 | No request sent yet; recipient not confirmed |
| owner response received / accepted / rejected | 0 / 0 / 0 | Must not be artificially inflated |
| live evidence / reload / receiver / route smoke accepted | 0 / 0 / 0 / 0 | Not authorized, not executed |
| receiver receipt / stale / silence / false-green / post-reload accepted | 0 / 0 / 0 / 0 / 0 | No reviewer record received; a route 200 or UI visibility must not be substituted |
| Telegram send / alert chain smoke / runtime gate | 0 / 0 / 0 | Nothing sent, nothing tested, no execution entry point |
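The zero-valued counters in the summary can be read as invariants a guard might enforce. The following is a minimal sketch only: the counter names are hypothetical and the real snapshot JSON may use different keys. It assumes that accepted plus rejected responses can never exceed received responses, and that every authorization-gated counter stays at 0 in this phase.

```python
from typing import Dict, List

# Hypothetical counter names; the real snapshot JSON may use different keys.
GATED_COUNTERS = (
    "live_evidence_accepted",
    "reload_accepted",
    "receiver_accepted",
    "route_smoke_accepted",
    "telegram_send",
    "alert_chain_smoke",
    "runtime_gate",
)


def check_counters(counters: Dict[str, int]) -> List[str]:
    """Return a list of violations; an empty list means the counters are consistent."""
    problems: List[str] = []
    received = counters.get("owner_response_received", 0)
    accepted = counters.get("owner_response_accepted", 0)
    rejected = counters.get("owner_response_rejected", 0)
    # A response cannot be accepted or rejected before it has been received.
    if accepted + rejected > received:
        problems.append("accepted + rejected exceeds received")
    # Gated counters must not move while the ledger is metadata-only.
    for name in GATED_COUNTERS:
        if counters.get(name, 0) != 0:
            problems.append(f"{name} must stay 0 at this stage")
    return problems
```

Returning a list of findings, instead of raising on the first one, lets a reviewer see every inflated counter in a single pass.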
3. Reviewer checks
| Check | Description |
|---|---|
| owner identity | The owner role / team must be traceable |
| decision / reason | A decision and its reason must both be present and must not contain sensitive values |
| affected scope | Must map back to a committed surface_id |
| redacted evidence | May only be a redacted ref, hash, ticket, commit, or artifact pointer |
| secret absence | No token, Bot token, DSN secret, cookie, private key, env dump, or partial secret may appear |
| live config hash | May only be an owner-provided metadata ref; raw config must not be pasted |
| reload owner | Reload / deploy changes must have an owner |
| receiver owner | Receiver route, Telegram receipt, and notification policy must have an owner |
| route smoke plan | Route smoke / receipt proof must be a plan or a redacted evidence ref; alerts must not be fired directly |
| incident context | An incident backfill must carry an incident / change / outage context ref; "service restored" alone is not enough |
| alert chain health | Alert chain health must not be judged only from a public route 200, container up, dashboard up, or UI visibility |
| receiver receipt | Receipt proof may only be a redacted ref, hash, message id, or ticket; raw notification payloads must not be pasted |
| stale alert | Must confirm whether the alert is stale, pending, resolved but not cleared, or whether its data source has stopped updating |
| silence / dedup | Must confirm whether a silence, mute, dedup, inhibit, or maintenance rule is causing a false green |
| false-green risk | Must state a no-false-green determination so that route up, container up, or dashboard up is not treated as alert chain up |
| post-reload readback | If a reload / deploy follows, a post-reload readback plan and stop condition must exist first |
| cross-project notification | If AwoooP, IwoooS, agent-bounty, StockPlatform, the public website, or monitoring is affected, a cross-project notification ref is required |
| noise budget | Alert noise, silence, dedup, and test notifications must have an owner |
| maintenance window | Reload, deploy, route change, smoke, or notification send requires a separate window |
| rollback owner | A rollback owner, rollback ref, or disable path must exist |
| validation plan | Must list route, receipt, alert state, metrics, and rollback stop condition |
| execution request | Reject any response that bundles a request for reload, receiver route change, Telegram send, alert smoke, SSH, or kubectl |
| count transition | Only a reviewer record may update received / accepted / rejected; the runtime gate must not be opened at the same time |
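A few of the checks above (required fields, secret absence, execution request, false-green risk) can be sketched as a triage function. This is an illustration under stated assumptions, not the logic of the actual tool: the field names `owner_role`, `surface_id`, `requested_actions`, and `health_signals`, the signal labels, and the finding strings are all hypothetical, and the two regular expressions are far narrower than a real secret scanner.

```python
import re
from typing import Any, Dict, List

# Hypothetical field names; the real acceptance schema has 38 fields.
REQUIRED_FIELDS = ("owner_role", "decision", "reason", "surface_id")

# Illustrative patterns only; a real scanner would need a broader rule set.
SECRET_PATTERNS = (
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(token|secret|password|cookie)\s*[=:]\s*\S+"),
)

# Signals that must not, on their own, prove the alert chain is healthy.
WEAK_HEALTH_SIGNALS = frozenset(
    {"route_200", "container_up", "dashboard_up", "ui_visible"}
)


def triage(response: Dict[str, Any]) -> List[str]:
    """Return triage findings for one owner response; empty means no finding."""
    findings: List[str] = []
    missing = [f for f in REQUIRED_FIELDS if not response.get(f)]
    if missing:
        findings.append("supplement: missing " + ", ".join(missing))
    # Scan every value as text so secrets hidden in any field are caught.
    text = " ".join(str(v) for v in response.values())
    if any(p.search(text) for p in SECRET_PATTERNS):
        findings.append("quarantine: possible secret value")
    if response.get("requested_actions"):
        findings.append("reject: execution request attached")
    signals = set(response.get("health_signals", []))
    if signals and signals <= WEAK_HEALTH_SIGNALS:
        findings.append("no-false-green: only weak health signals")
    return findings
```

For example, a response whose only health evidence is `["route_200", "dashboard_up"]` yields the no-false-green finding, while one that also carries a redacted receipt ref does not.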
4. Blocked actions
prometheus_reload
alertmanager_reload
grafana_dashboard_apply
signoz_rule_apply
sentry_deploy
langfuse_config_change
otel_collector_reload
receiver_route_change
silence_policy_change
telegram_send
notification_route_change
webhook_receiver_change
remote_write_change
exporter_deploy
live_alert_fire
alert_chain_smoke
ssh_read
ssh_write
kubectl_action
secret_value_collection
host_write
active_scan
production_write
runtime_gate_open
raw_monitoring_payload_storage
accept_secret_value_evidence
mark_owner_response_accepted_without_reviewer_record
mark_route_200_as_alert_chain_healthy
mark_receiver_healthy_without_receipt
accept_silence_without_owner
accept_stale_alert_without_review
accept_reload_without_postcheck
store_raw_alert_payload
add_action_button
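The list above can be treated as a deny set. As a minimal sketch, assuming a caller passes action identifiers as plain strings (the function name `find_blocked` is hypothetical and not part of the repository), a check might look like this:

```python
from typing import Iterable, List

# The 34 blocked action identifiers listed in this section.
BLOCKED_ACTIONS = frozenset("""
prometheus_reload alertmanager_reload grafana_dashboard_apply signoz_rule_apply
sentry_deploy langfuse_config_change otel_collector_reload receiver_route_change
silence_policy_change telegram_send notification_route_change
webhook_receiver_change remote_write_change exporter_deploy live_alert_fire
alert_chain_smoke ssh_read ssh_write kubectl_action secret_value_collection
host_write active_scan production_write runtime_gate_open
raw_monitoring_payload_storage accept_secret_value_evidence
mark_owner_response_accepted_without_reviewer_record
mark_route_200_as_alert_chain_healthy mark_receiver_healthy_without_receipt
accept_silence_without_owner accept_stale_alert_without_review
accept_reload_without_postcheck store_raw_alert_payload add_action_button
""".split())


def find_blocked(requested: Iterable[str]) -> List[str]:
    """Return the requested actions that are blocked, sorted for stable output."""
    return sorted(set(requested) & BLOCKED_ACTIONS)
```

A non-empty result would mean the request has to be rejected before acceptance; an unknown identifier passes this sketch, so a stricter variant could use an allow list instead.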
5. Commands
Regenerate the committed snapshot with a pinned timestamp:
python3 scripts/security/monitoring-owner-response-acceptance.py \
--root . \
--owner-request-report docs/security/monitoring-owner-request-draft.snapshot.json \
--output docs/security/monitoring-owner-response-acceptance.snapshot.json \
--generated-at 2026-06-15T18:20:00+08:00
Read-only guards:
python3 scripts/security/iwooos-config-control-guard.py --root .
python3 scripts/security/security-mirror-progress-guard.py --root .
python3 scripts/security/source-control-owner-response-guard.py --root .
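The three guards can also be run in sequence from one wrapper. The sketch below assumes each guard signals failure with a non-zero exit code, which the source does not state; `run_guards` and `guard_command` are hypothetical helper names, and the wrapper uses the current interpreter instead of a literal `python3`.

```python
import subprocess
import sys
from typing import Callable, List, Optional

GUARDS = (
    "scripts/security/iwooos-config-control-guard.py",
    "scripts/security/security-mirror-progress-guard.py",
    "scripts/security/source-control-owner-response-guard.py",
)


def guard_command(script: str, root: str = ".") -> List[str]:
    """Build the argv for one guard, mirroring the commands shown above."""
    return [sys.executable, script, "--root", root]


def run_guards(
    root: str = ".",
    runner: Optional[Callable[[List[str]], int]] = None,
) -> int:
    """Run each guard in order; stop at the first non-zero exit code."""
    # Assumption: a guard reports failure through a non-zero exit code.
    run = runner or (lambda argv: subprocess.run(argv).returncode)
    for script in GUARDS:
        code = run(guard_command(script, root))
        if code != 0:
            return code
    return 0
```

The injectable `runner` exists only so the sequencing can be exercised without touching the repository; calling `run_guards(".")` from the repository root would invoke the real scripts.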
6. Completion
| Work item | Completion | Notes |
|---|---|---|
| owner response acceptance artifact | 100% | 60 candidates, snapshot, document, and guards are fixed |
| no-false-green backfill gate | 100% | 23 reviewer checks, 12 outcome lanes, and 34 blocked action types are fixed |
| request dispatch | 0% | Not sent yet; recipient not confirmed |
| owner response received / accepted | 0% | No owner response received or accepted yet |
| live evidence collection | 0% | The live monitoring stack has not been read |
| reload / receiver / Telegram / alert chain smoke | 0% | Not authorized, not executed |
| runtime / production write | 0% | No action button, no production write |
7. Boundaries
This ledger is not live alert chain truth, not Telegram delivery proof, not reload approval, and not monitoring runtime approval. The owner response acceptance ledger, snapshot, LOGBOOK, IwoooS UI, public route 200, dashboard up, container up, or AwoooP approval must not be interpreted as authorization for Prometheus reload, Alertmanager reload, Grafana import, SigNoz apply, Sentry deploy, Langfuse change, OTEL reload, receiver route change, silence change, Telegram send, live alert fire, alert chain smoke, SSH, kubectl, active scan, secret collection, host write, or production write.