Wire in PChome controlled-apply drift smoke check

2026-07-02 13:03:07 +08:00
parent 72bd8e1e54
commit 6caf818ba3
4 changed files with 159 additions and 7 deletions
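
The status rule this commit adds to the smoke checks can be summarized with a small standalone sketch. It is illustrative only: `classify_drift_summary` and its reason strings are hypothetical names that do not exist in the repository, and the summary keys are taken from the diff below. The order of the branches follows the new check: write-risk first, then drift, then the verified case, with everything else reported as a warning.

```python
from typing import Any, Dict, Tuple


def classify_drift_summary(summary: Dict[str, Any]) -> Tuple[str, str]:
    """Map a drift-verifier summary to (status, reason).

    Hypothetical helper: it mirrors the branch order of the smoke check in
    this commit, but it is not part of the codebase.
    """

    def count(key: str) -> int:
        # Missing or None values are treated as zero, as in the check itself.
        return int(summary.get(key) or 0)

    if count("writes_database_count"):
        return "critical", "verifier write-risk"
    if count("drift_count"):
        return "critical", "drift detected"

    selectors = count("target_selector_count")
    passes = count("post_apply_readback_pass_count")
    if count("drift_verified_count") and selectors and passes == selectors:
        return "ok", "verified with zero drift"

    # Anything else (no selectors, partial readback, not verified) is a warning.
    return "warning", "monitoring criteria not yet met"
```

An empty or partial summary falls through to `warning` rather than `ok`, so a verifier that returns nothing is never reported as healthy.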
--- a/docs/AI_INTELLIGENCE_MODULE_SOT.md
+++ b/docs/AI_INTELLIGENCE_MODULE_SOT.md
@@ -84,6 +84,7 @@
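 - From V10.643, `/ai_intelligence` must show a "product strategy triage" (「商品策略分流」) visual summary above the product details, covering at least four categories: price pressure, price advantage, pending confirmation, and missing price comparison. Each category must show an item count, last-7-day sales, and a proportion bar, and must be clickable to switch the detail view. The legacy KPI cards must not be static numbers either; they must link to all products, actionable products, high-risk price comparisons, or the handling log.
 - From V10.725 source-ready, PChome growth must provide `/api/ai/pchome-growth/ai-automation-readiness` and the Dashboard "AI main flow" (「AI 主流程」) status bar. The same summary must aggregate gap detection, same-item search packages, candidate decision packages, evidence receipts, and controlled apply, and must explicitly output `primary_human_gate_count=0` and `automation_policy.primary_flow=ai_controlled`. PChome mapping must not treat AI exception decisions as the main flow; every exception must go through AI machine-verifiable auto-resolution, producing failure reasons, the next machine action, and a rollback/readback path.
 - From 2026-07-02, the mainline work order for PChome AI automation uses `docs/guides/pchome_ai_automation_priority_backlog.md` as the executable backlog. Requirements the user adds mid-stream (production truth, versions must be correct, GitHub freeze, releasing to Gitea/production, AI automation replacing the manual main flow, external professional benchmarks, mainstream professional product websites, implementation results, and full prioritization) must all be recorded in the backlog and advanced by P0/P1/P2/P3/P4. Side tracks not in the backlog must not override P0 runtime truth / controlled apply closure.
+- From 2026-07-02, AI automation smoke must routinely run the PChome controlled-apply drift monitor. The `PChome 受控落地 drift monitor` check runs receipt replay + drift verifier in read-only mode, escalates drift detected or verifier write-risk to `critical`, and reports selector/readback/drift/artifact hash status in `/api/ai-automation/smoke` and the daily smoke summary.
 - From V10.644, product detail rows in `/ai_intelligence` must not describe the price comparison in a sentence only; each row must show four price-evidence cells (PChome price, MOMO reference price, gap, and confidence) and keep the next-step button. Unit-price candidates must show the unit price and the unit; candidates pending confirmation or missing data are shown as 「待補 / 候選待確認」 (to be filled / candidate pending confirmation), and prices must not be fabricated.
 - From V10.645, after the triage category is switched in the `/ai_intelligence` product details, an action summary titled 「這類商品怎麼處理」 (how to handle this category) must be shown, including item count, last-7-day sales, average confidence, largest price gap, a representative product, and the primary button; the user must never be left with only a product list and no next step.
 - From V10.646, the `/ai_intelligence` product details must provide search and sorting. Search must cover at least product, category, product ID, and MOMO candidate information; sorting must support at least priority, last-7-day sales, price gap, decline magnitude, and confidence. After searching or sorting, the action summary and the detail list must use the same result set.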
--- a/docs/guides/pchome_ai_automation_priority_backlog.md
+++ b/docs/guides/pchome_ai_automation_priority_backlog.md
@@ -62,6 +62,10 @@
 - Drift verifier artifact receipt completed:
  - materialized artifact count `1`
  - hash match count `1`
+- Automated drift monitor / smoke path completed:
+  - `PChome 受控落地 drift monitor` is now part of the AI automation smoke checks
+  - every `/api/ai-automation/smoke` call and daily smoke summary routinely runs the read-only drift verifier
+  - drift detected or verifier write-risk is escalated to `critical`
 - AI debt scanner shows the product surface is clear:
  - `PRODUCT_SURFACE_CLEAR`
  - `finding_count=0`
@@ -69,10 +73,9 @@

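 In progress / next steps, to be done in order:

-1. Build an automated drift-verifier monitor / smoke path so the verifier is checked routinely rather than only callable by hand.
-2. Build a drift rollback / re-apply recommendation package so that future drift automatically yields a repair plan and rollback evidence.
-3. Build a production compact readback endpoint that returns the latest apply / replay / drift receipts.
-4. Build a PChome controlled-apply artifacts retention policy so evidence stays traceable without growing without bound.
+1. Build a drift rollback / re-apply recommendation package so that future drift automatically yields a repair plan and rollback evidence.
+2. Build a production compact readback endpoint that returns the latest apply / replay / drift receipts.
+3. Build a PChome controlled-apply artifacts retention policy so evidence stays traceable without growing without bound.

 Completion criteria: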

@@ -184,8 +187,8 @@
 | P0.4 | Product readiness visibility | Done | `AI_AUTOMATION_CONTROLLED_APPLY_CLOSEOUT_VERIFIED` | Wire into UI first viewport |
 | P0.5 | Drift verifier | Done | `DRIFT_VERIFIED`, `drift_count=0`, readback `4/4` | Build monitor / smoke |
 | P0.6 | Drift verifier artifact | Done | drift artifact hash match `1` | Add latest compact readback |
-| P0.7 | Automated drift monitor | Not started | none | Next to implement |
-| P0.8 | Drift rollback / re-apply package | Not started | none | Implement after P0.7 |
+| P0.7 | Automated drift monitor | Done | smoke check `PChome 受控落地 drift monitor` | Include in daily smoke and runtime readback |
+| P0.8 | Drift rollback / re-apply package | Not started | none | Next to implement |
 | P1.1 | Dashboard AI automation first-viewport surface | Not started | API readiness exists | Implement after P0 monitor |
 | P1.2 | UI wording guard for no raw engineering terms | Not started | existing guardrails only | Add tests for the new automation surface |
 | P2.1 | External benchmark encoded into requirements | Not started | benchmark guide exists | Update guardrails / tests |
--- a/services/ai_automation_smoke_service.py
+++ b/services/ai_automation_smoke_service.py
@@ -26,6 +26,9 @@ _HISTORY_PATH = os.getenv(
 )
 _HISTORY_LIMIT = int(os.getenv("MOMO_AI_AUTOMATION_SMOKE_HISTORY_LIMIT", "200"))
 _HISTORY_LOCK = threading.Lock()
+_PCHOME_DRIFT_MONITOR_DB_STATEMENT_TIMEOUT_MS = int(
+    os.getenv("MOMO_PCHOME_DRIFT_MONITOR_DB_STATEMENT_TIMEOUT_MS", "5000")
+)


 def _check(name: str, status: str, summary: str, details: Dict[str, Any] | None = None) -> Dict[str, Any]:
@@ -258,6 +261,17 @@ def _row_mapping(row: Any) -> Dict[str, Any]:
    return {}


+def _create_pchome_drift_monitor_engine(database_path: str):
+    from sqlalchemy import create_engine
+
+    engine_kwargs: Dict[str, Any] = {}
+    if str(database_path).startswith(("postgresql://", "postgresql+psycopg2://", "postgres://")):
+        engine_kwargs["connect_args"] = {
+            "options": f"-c statement_timeout={_PCHOME_DRIFT_MONITOR_DB_STATEMENT_TIMEOUT_MS}"
+        }
+    return create_engine(database_path, **engine_kwargs)
+
+
 def _gemini_egress_check(window_hours: int = 24) -> Dict[str, Any]:
    """Read-only runtime sentinel for unexpected Gemini spend."""
    session = None
@@ -478,6 +492,86 @@ def _elephant_hitl_check() -> Dict[str, Any]:
        return _check("ElephantAlpha AI 例外決策", "critical", f"ElephantAlpha smoke 失敗：{exc}")


+def _pchome_controlled_apply_drift_monitor_check() -> Dict[str, Any]:
+    """Read-only monitor that runs the PChome controlled-apply drift verifier."""
+    engine = None
+    try:
+        from config import DATABASE_PATH
+        from services import pchome_mapping_backlog_service as backlog
+
+        engine = _create_pchome_drift_monitor_engine(DATABASE_PATH)
+        receipt_replay = (
+            backlog.build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_receipt_replay_package(
+                materialize_artifacts=False,
+                engine=engine,
+            )
+        )
+        drift_verifier = (
+            backlog.build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_drift_verifier_package(
+                engine=engine,
+                source_receipt_replay=receipt_replay,
+                materialize_artifacts=False,
+            )
+        )
+        summary = drift_verifier.get("summary") or {}
+        drift_count = int(summary.get("drift_count") or 0)
+        selector_count = int(summary.get("target_selector_count") or 0)
+        pass_count = int(summary.get("post_apply_readback_pass_count") or 0)
+        verified_count = int(summary.get("drift_verified_count") or 0)
+        artifact_count = int(summary.get("drift_verifier_artifact_materialized_count") or 0)
+        artifact_hash_match_count = int(summary.get("drift_verifier_artifact_hash_match_count") or 0)
+        writes_database_count = int(summary.get("writes_database_count") or 0)
+        result = str(drift_verifier.get("result") or "UNKNOWN")
+        receipt_result = str(receipt_replay.get("result") or "UNKNOWN")
+
+        if writes_database_count:
+            status = "critical"
+            summary_text = "PChome drift monitor 偵測到 verifier 有寫 DB 風險"
+        elif drift_count:
+            status = "critical"
+            summary_text = f"PChome controlled apply 偵測到 {drift_count} 筆 drift"
+        elif verified_count and selector_count and pass_count == selector_count:
+            status = "ok"
+            summary_text = f"PChome controlled apply drift 已驗證 {pass_count}/{selector_count}，目前 0 drift"
+        else:
+            status = "warning"
+            summary_text = "PChome controlled apply drift verifier 尚未達例行監控完成條件"
+
+        return _check(
+            "PChome 受控落地 drift monitor",
+            status,
+            summary_text,
+            {
+                "result": result,
+                "source_receipt_replay_result": receipt_result,
+                "selector_count": selector_count,
+                "readback_pass_count": pass_count,
+                "drift_count": drift_count,
+                "drift_verified_count": verified_count,
+                "artifact_count": artifact_count,
+                "artifact_hash_match_count": artifact_hash_match_count,
+                "writes_database_count": writes_database_count,
+                "writes_database": False,
+                "materialize_artifacts": False,
+                "requires_production_version_truth": True,
+            },
+        )
+    except Exception as exc:
+        return _check(
+            "PChome 受控落地 drift monitor",
+            "warning",
+            f"PChome drift verifier 例行監控暫時無法讀取：{exc}",
+            {
+                "writes_database": False,
+                "materialize_artifacts": False,
+                "requires_production_version_truth": True,
+            },
+        )
+    finally:
+        if engine is not None:
+            engine.dispose()
+
+
 def collect_ai_automation_smoke(*, record_history: bool = True, history_limit: int = 20) -> Dict[str, Any]:
    checks: List[Dict[str, Any]] = [
        _event_router_check(),
@@ -486,6 +580,7 @@ def collect_ai_automation_smoke(*, record_history: bool = True, history_limit: i
        _nemotron_check(),
        _embedding_queue_check(),
        _elephant_hitl_check(),
+        _pchome_controlled_apply_drift_monitor_check(),
    ]
    worst = max(checks, key=lambda item: STATUS_RANK.get(item["status"], 2))["status"]
    result = {
--- a/tests/test_ai_automation_smoke_service.py
+++ b/tests/test_ai_automation_smoke_service.py
@@ -32,11 +32,63 @@ def test_collect_ai_automation_smoke_uses_worst_status(monkeypatch):
    monkeypatch.setattr(smoke, "_nemotron_check", lambda: smoke._check("nemotron", "ok", "ok"))
    monkeypatch.setattr(smoke, "_embedding_queue_check", lambda: smoke._check("embedding", "critical", "boom"))
    monkeypatch.setattr(smoke, "_elephant_hitl_check", lambda: smoke._check("elephant", "ok", "ok"))
+    monkeypatch.setattr(smoke, "_pchome_controlled_apply_drift_monitor_check", lambda: smoke._check("pchome", "ok", "ok"))

    result = smoke.collect_ai_automation_smoke(record_history=False)

    assert result["status"] == "critical"
-    assert result["summary"] == {"ok": 4, "warning": 1, "critical": 1, "total": 6}
+    assert result["summary"] == {"ok": 5, "warning": 1, "critical": 1, "total": 7}
+
+
+def test_pchome_controlled_apply_drift_monitor_reports_verified_zero_drift(monkeypatch):
+    from services import ai_automation_smoke_service as smoke
+    from services import pchome_mapping_backlog_service as backlog
+
+    class FakeEngine:
+        disposed = False
+
+        def dispose(self):
+            self.disposed = True
+
+    fake_engine = FakeEngine()
+    monkeypatch.setattr(smoke, "_create_pchome_drift_monitor_engine", lambda _path: fake_engine)
+    monkeypatch.setattr(
+        backlog,
+        "build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_receipt_replay_package",
+        lambda **_kwargs: {
+            "result": "DIRECT_MAPPING_RETRY_EXCEPTION_CONTROLLED_APPLY_RECEIPT_REPLAYED",
+            "summary": {
+                "target_selector_count": 4,
+                "post_apply_readback_pass_count": 4,
+                "executor_receipt_hash_match_count": 1,
+            },
+        },
+    )
+    monkeypatch.setattr(
+        backlog,
+        "build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_drift_verifier_package",
+        lambda **_kwargs: {
+            "result": "DIRECT_MAPPING_RETRY_EXCEPTION_CONTROLLED_APPLY_DRIFT_VERIFIED",
+            "summary": {
+                "target_selector_count": 4,
+                "post_apply_readback_pass_count": 4,
+                "drift_count": 0,
+                "drift_verified_count": 1,
+                "drift_verifier_artifact_materialized_count": 1,
+                "drift_verifier_artifact_hash_match_count": 1,
+                "writes_database_count": 0,
+            },
+        },
+    )
+
+    result = smoke._pchome_controlled_apply_drift_monitor_check()
+
+    assert result["status"] == "ok"
+    assert result["details"]["selector_count"] == 4
+    assert result["details"]["drift_count"] == 0
+    assert result["details"]["writes_database"] is False
+    assert result["details"]["materialize_artifacts"] is False
+    assert fake_engine.disposed is True


 def test_collect_ai_automation_smoke_persists_recent_history(tmp_path, monkeypatch):
@@ -51,6 +103,7 @@ def test_collect_ai_automation_smoke_persists_recent_history(tmp_path, monkeypat
    monkeypatch.setattr(smoke, "_nemotron_check", lambda: smoke._check("nemotron", "ok", "ok"))
    monkeypatch.setattr(smoke, "_embedding_queue_check", lambda: smoke._check("embedding", "ok", "ok"))
    monkeypatch.setattr(smoke, "_elephant_hitl_check", lambda: smoke._check("elephant", "ok", "ok"))
+    monkeypatch.setattr(smoke, "_pchome_controlled_apply_drift_monitor_check", lambda: smoke._check("pchome", "ok", "ok"))

    first = smoke.collect_ai_automation_smoke(history_limit=5)
    second = smoke.collect_ai_automation_smoke(history_limit=5)
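
The engine helper added in `services/ai_automation_smoke_service.py` applies a statement timeout only when the database URL is a PostgreSQL one. The selection logic can be exercised without a database by isolating it into a pure function. The sketch below is a hypothetical refactoring: `drift_monitor_engine_kwargs` and `POSTGRES_PREFIXES` are names invented here, while the URL prefixes, the `connect_args`/`options` shape, and the 5000 ms default come from the diff.

```python
from typing import Any, Dict

# URL prefixes that the helper in this commit treats as PostgreSQL.
POSTGRES_PREFIXES = ("postgresql://", "postgresql+psycopg2://", "postgres://")


def drift_monitor_engine_kwargs(database_url: str, timeout_ms: int = 5000) -> Dict[str, Any]:
    """Return keyword arguments for an engine factory (hypothetical helper).

    For PostgreSQL URLs the server-side statement_timeout is passed through
    the libpq `options` connection parameter; other backends get no extras.
    """
    if not str(database_url).startswith(POSTGRES_PREFIXES):
        return {}
    return {"connect_args": {"options": f"-c statement_timeout={timeout_ms}"}}
```

Under this assumption the real helper would reduce to `create_engine(url, **drift_monitor_engine_kwargs(url))`, and the timeout behaviour could be unit-tested without monkeypatching the engine factory.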