docs(awooop): record t14b auto approved evidence link
@@ -7751,3 +7751,87 @@ auto_repair_24h=6
- Production smoke has not yet produced a new auto-repair event that could verify the fallback write, so a complete closed loop still cannot be claimed; this is the correct, conservative reading.
- Next step T14b: wait for the next `auto_repair=true` event or design a safe live-fire test to verify that `auto_repair_executions -> incident_evidence.verification_result -> learning/KM -> truth-chain auto_repaired_verified` holds end to end; also add incident linkage and a durable execution record for auto-approved approval executions.
- Overall progress update: about 80%.

### 2026-05-13 - AwoooP truth-chain T14b: add incident linkage and durable evidence for auto-approved execution (production deployed)

**Live diagnosis**:

- On the high-confidence auto-execution paths of CS2 `auto_approve_rule_engine` and CS3 `auto_approve_llm_cs3`, `ApprovalExecutionService.execute_approved_action()` is called first and the incident is created afterwards.
- The executor therefore has no `incident_id` at execution time, so the post-execution verifier, KM writeback, incident resolve, and `auto_repair_executions` cannot be linked back to the same alert.
- CS3 has one more concrete break: auto approval does not pass the DB `approval.id` to the executor, so the execution status is written back to the wrong transient id.

**Changes**:

- `ApprovalExecutionService.finalize_auto_approved_execution()` is added as the convergence point that "does not re-run the action, only fills in the evidence chain":
  - Writes `auto_repair_executions` with `triggered_by=auto_approve_*`.
  - Adds an incident-linked timeline event.
  - Writes KM in auto-repair mode.
  - Calls `PostExecutionVerifier` with `action_taken=auto_repair_playbook:*` so that fallback evidence can obtain `matched_playbook_id`.
  - Resolves the incident on success.
  - `NO_ACTION` / `OBSERVE` / `INVESTIGATE` do not count as auto-repair, which avoids polluting the KPI.
- CS2 / CS3 call finalize after the incident is created and `update_incident_id()` has run.
- CS3 adds `_cs3_auto_approval.id = approval.id` and `service.update_execution_status()`.
- The `requested_by` check changes from accepting only `auto_approve` to accepting `auto_approve*`, so that `auto_approve_rule_engine` / `auto_approve_llm_cs3` are no longer mislabeled by KM as manual repair.

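
The finalize step described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual `ApprovalExecutionService` code: the `EvidenceStore` class, the function signature, the dictionary fields, and the `RESTART_SERVICE` action type in the usage note are all assumptions made for the example. It only shows the two properties the changelog names: the action is never re-run, and `NO_ACTION` / `OBSERVE` / `INVESTIGATE` are not recorded as auto-repair.

```python
from dataclasses import dataclass, field

# Action types that the changelog says must not count as auto-repair.
NON_REPAIR_ACTIONS = {"NO_ACTION", "OBSERVE", "INVESTIGATE"}


@dataclass
class EvidenceStore:
    """Hypothetical in-memory stand-in for the real tables and services."""
    auto_repair_executions: list = field(default_factory=list)
    timeline: list = field(default_factory=list)
    km_entries: list = field(default_factory=list)
    verifications: list = field(default_factory=list)
    resolved_incidents: set = field(default_factory=set)


def finalize_auto_approved_execution(store, incident_id, requested_by,
                                     action_type, playbook_id, success):
    """Attach evidence for an action that has ALREADY run; never re-run it."""
    if action_type in NON_REPAIR_ACTIONS:
        return False  # not a repair, so nothing is recorded as auto-repair

    # Durable execution record, linked to the incident created afterwards.
    store.auto_repair_executions.append({
        "incident_id": incident_id,
        "triggered_by": requested_by,
        "playbook_id": playbook_id,
        "success": success,
    })
    # Incident-linked timeline event.
    store.timeline.append((incident_id, f"auto-approved execution by {requested_by}"))
    # KM entry written in auto-repair mode.
    store.km_entries.append({"incident_id": incident_id, "mode": "auto_repair"})
    # Verifier input carries the playbook id inside action_taken.
    store.verifications.append({
        "incident_id": incident_id,
        "action_taken": f"auto_repair_playbook:{playbook_id}",
    })
    # Resolve only when the earlier execution succeeded.
    if success:
        store.resolved_incidents.add(incident_id)
    return True
```

Under these assumptions, calling the function with `action_type="RESTART_SERVICE"` records one execution row and resolves the incident, while calling it with `action_type="OBSERVE"` returns `False` and leaves the store untouched.
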
**Local verification**:

```text
python3 -m py_compile apps/api/src/services/approval_execution.py apps/api/src/api/v1/webhooks.py apps/api/tests/test_approval_execution_auto_approved_finalize.py
OK

ruff check --select F821 apps/api/src/services/approval_execution.py apps/api/src/api/v1/webhooks.py apps/api/tests/test_approval_execution_auto_approved_finalize.py
OK

pytest tests/test_approval_execution_auto_approved_finalize.py tests/test_approval_execution_no_action.py tests/test_learning_chain_e2e.py tests/test_awooop_truth_chain_service.py -q
26 passed

pytest tests/test_post_execution_verifier.py tests/test_learning_chain_e2e.py tests/test_awooop_truth_chain_service.py tests/test_platform_router_order.py tests/test_cs1_auto_execute.py tests/test_cs3_auto_execute.py tests/test_approval_execution_auto_approved_finalize.py -q
77 passed

pytest tests/test_rule_engine_auto_execute.py tests/test_alertmanager_rule_bypass.py tests/test_approval_execution_auto_approved_finalize.py -q
31 passed
```

**Production deploy / smoke (done)**:

```text
Commit: 596f2f68 fix(awooop): link auto approved execution evidence
Gitea:
  2066 code-review 596f2f68 -> success
  2065 CD Pipeline 596f2f68 -> success
    tests -> success
    build-and-deploy -> success
    post-deploy-checks -> success
Deploy marker: edba52f4 chore(cd): deploy 596f2f6 [skip ci]

K8s image:
  awoooi-api 192.168.0.110:5000/awoooi/api:596f2f682094d0916f6a18a6f50e7667e4ca86ff
  awoooi-worker 192.168.0.110:5000/awoooi/api:596f2f682094d0916f6a18a6f50e7667e4ca86ff
  awoooi-web 192.168.0.110:5000/awoooi/web:596f2f682094d0916f6a18a6f50e7667e4ca86ff

health:
  https://awoooi.wooo.work/api/v1/health -> 200

quality summary, hours=24, limit=30:
  verified_auto_repair_total=0
  production_claim.can_claim_full_auto_repair=false
  by_verdict:
    manual_required_no_action=17
    received_only=12
    approval_required=1

DB baseline after deploy time 2026-05-13T11:19:27Z:
  auto_repair_since_deploy=0
  auto_approved_since_deploy=0
  verified_evidence_since_deploy=0
  auto_repair_24h=5
  auto_approved_24h=0
  verified_evidence_24h=0
```

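Interpretation:

- T14b is complete and released: the next CS2/CS3 auto-approved real execution will leave incident-linked `auto_repair_executions`, timeline, KM, and verifier evidence, rather than stopping at Telegram / log.
- Production smoke has not yet seen a new auto-approved or auto-repair live event since the deploy, so the complete closed loop still cannot be claimed as proven by production live-fire.
- Next step T14c: use a safe live-fire test or wait for a natural alert to verify that `auto_approve_* -> auto_repair_executions -> incident_evidence.verification_result -> learning/KM -> truth-chain auto_repaired_verified` actually works end to end; also change the Telegram card to state explicitly "which node the flow has currently reached / whether it was auto-repaired / whether it was handed to a human".
- Overall progress update: about 82%.
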
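
The conservative reading above can be expressed as a small gate over the smoke counters. The sketch below is an assumption about how such a check might look, not the rule the quality summary actually uses: the function name `can_claim_live_fire_verified` and the exact condition (at least one verified auto-repair overall and at least one verified evidence row since the deploy) are hypothetical. The counter names and values are taken from the smoke output.

```python
def can_claim_live_fire_verified(quality_summary: dict, db_baseline: dict) -> bool:
    """Hypothetical gate: claim the closed loop only with verified post-deploy evidence."""
    verified_total = quality_summary.get("verified_auto_repair_total", 0)
    verified_since_deploy = db_baseline.get("verified_evidence_since_deploy", 0)
    return verified_total > 0 and verified_since_deploy > 0


# Counters as reported by the production smoke after the T14b deploy.
quality_summary = {"verified_auto_repair_total": 0}
db_baseline = {
    "auto_repair_since_deploy": 0,
    "auto_approved_since_deploy": 0,
    "verified_evidence_since_deploy": 0,
}

claim = can_claim_live_fire_verified(quality_summary, db_baseline)
```

With the counters from this deploy, `claim` is `False`, which matches the decision not to declare the loop verified until a real auto-approved or auto-repair event arrives.
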
@@ -2040,6 +2040,14 @@ After Phase 6 completes
- Smoke: the quality summary still shows `verified_auto_repair_total=0` and `production_claim=false`; there has been no new auto-repair event since the deploy (`auto_repair_since_deploy=0`), so a complete closed loop cannot be claimed. The only supportable claim is that "future auto-repair verifier results will have a durable evidence target".
- Next step T14b: wait for the next `auto_repair=true` event or design a safe live-fire test to verify the full chain `auto_repair_executions -> incident_evidence.verification_result -> learning/KM -> truth-chain auto_repaired_verified`; also add incident linkage and a durable execution record for auto-approved approval executions.

**T14b auto-approved execution incident linkage production deployed (2026-05-13, Taipei)**:
- Trigger: CS2 `auto_approve_rule_engine` and CS3 `auto_approve_llm_cs3` execute the action first and create the incident afterwards; the executor has no `incident_id` at that moment, so `auto_repair_executions`, timeline, KM, PostExecutionVerifier, and incident resolve cannot be linked back to the same event. CS3 also lacked `_cs3_auto_approval.id = approval.id`, which caused the execution status to be written back to a transient id.
- Fix: add `ApprovalExecutionService.finalize_auto_approved_execution()`, which fills in the durable trace after the incident is created without re-executing the action; it covers `auto_repair_executions(triggered_by=auto_approve*)`, the incident-linked timeline, KM, `PostExecutionVerifier(action_taken=auto_repair_playbook:*)`, and resolving the incident on success. `NO_ACTION` / `OBSERVE` / `INVESTIGATE` do not count as auto-repair.
- Webhook: CS2 / CS3 call finalize after `update_incident_id()`; CS3 adds the DB approval id and `update_execution_status()`; the `requested_by` check becomes `auto_approve*`, so that `auto_approve_rule_engine` / `auto_approve_llm_cs3` are no longer mislabeled as manual repair.
- Production: `596f2f68 fix(awooop): link auto approved execution evidence` has been pushed to Gitea main; Gitea run `2066` code-review succeeded, and run `2065` tests/build-and-deploy/post-deploy-checks all succeeded; deploy marker `edba52f4`; API/Worker image `192.168.0.110:5000/awoooi/api:596f2f682094d0916f6a18a6f50e7667e4ca86ff`, Web image `192.168.0.110:5000/awoooi/web:596f2f682094d0916f6a18a6f50e7667e4ca86ff`, health 200.
- Smoke: the quality summary still shows `verified_auto_repair_total=0` and `production_claim=false`; there has been no new auto-approved or auto-repair live event since the deploy (`auto_repair_since_deploy=0`, `auto_approved_since_deploy=0`, `verified_evidence_since_deploy=0`), so the complete closed loop still cannot be claimed as production live-fire verified.
- Next step T14c: use a safe live-fire test or wait for a natural alert to verify `auto_approve_* -> auto_repair_executions -> incident_evidence.verification_result -> learning/KM -> truth-chain auto_repaired_verified`; also change the Telegram card to show the flow node, whether the alert was auto-repaired, and whether it was handed to a human.

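
The `requested_by` change in the Webhook item is essentially a switch from an exact match to a prefix match. A minimal sketch, assuming the value is a plain string or `None`; the helper names and the `"auto_repair"` / `"manual_repair"` labels are illustrative and not taken from the codebase:

```python
from typing import Optional

AUTO_APPROVE_PREFIX = "auto_approve"


def is_auto_approved(requested_by: Optional[str]) -> bool:
    """Accept `auto_approve` and any `auto_approve*` variant."""
    return bool(requested_by) and requested_by.startswith(AUTO_APPROVE_PREFIX)


def km_repair_mode(requested_by: Optional[str]) -> str:
    """Pick the (hypothetical) KM label for a finished execution."""
    return "auto_repair" if is_auto_approved(requested_by) else "manual_repair"
```

An exact comparison such as `requested_by == "auto_approve"` would classify `auto_approve_rule_engine` and `auto_approve_llm_cs3` as manual, which is the mislabeling the changelog describes; the prefix check treats both as auto-approved.
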
---

### 2026-04-20 evening (Taipei) - C1-C4 end-to-end flow wiring - Playbook chain protection (commit de2d34d)
