diff --git a/docs/AI_INTELLIGENCE_MODULE_SOT.md b/docs/AI_INTELLIGENCE_MODULE_SOT.md index 4946d3e..18e4ea5 100644 --- a/docs/AI_INTELLIGENCE_MODULE_SOT.md +++ b/docs/AI_INTELLIGENCE_MODULE_SOT.md @@ -87,6 +87,7 @@ - 2026-07-02 起 AI automation smoke 必須例行執行 PChome controlled-apply drift monitor;`PChome 受控落地 drift monitor` 會以 read-only 方式重放 receipt replay + drift verifier,將 drift detected 或 verifier write-risk 升為 `critical`,並在 `/api/ai-automation/smoke` 與每日 smoke 摘要中回報 selector/readback/drift/artifact hash 狀態。 - 2026-07-02 起 PChome controlled-apply drift 必須提供 read-only rollback / re-apply recommendation package;`/api/ai/pchome-growth/mapping-backlog/direct-mapping-retry-candidate-exception-controlled-apply-drift-recovery-package` 會輸出 drift recovery actions、controlled re-apply SQL shape、rollback SQL shape、selector bindings、acceptance gates 與 artifact hash verifier。此 package 不執行 SQL、不寫 DB,0 drift 時必須產生 no-op evidence,drift detected 時才輸出 ready_for_controlled_reapply actions。 - 2026-07-02 起 PChome controlled-apply 必須提供 compact latest readback endpoint;`/api/ai/pchome-growth/mapping-backlog/direct-mapping-retry-candidate-exception-controlled-apply-compact-readback-package` 會收斂 apply、receipt replay、drift verifier、drift recovery 四段 receipt,輸出 product status、next machine action、selector readback、drift count、recovery action count 與 artifact hash 狀態。此 endpoint 是後續產品 UI 的主要資料來源,不執行 SQL、不寫 DB。 +- 2026-07-02 起 PChome controlled-apply artifacts 必須提供 read-only retention policy;`/api/ai/pchome-growth/mapping-backlog/direct-mapping-retry-candidate-exception-controlled-apply-artifact-retention-package` 會掃描 verifier inputs、identity readback、controlled apply preflight、executor、replay、drift verifier、drift recovery、compact readback 八類 artifacts,依 `keep_latest_per_family` 保留最新 evidence 並保護 active compact readback chain,只輸出 prune candidates 與 retention receipt,不直接刪檔、不寫 DB、不執行 destructive prune。 - V10.644 起 `/ai_intelligence` 的商品明細列不得只用句子描述比價;每列必須顯示 PChome 價格、MOMO 
參考價、差距、可信度四格價格證據,並保留下一步按鈕。單位價候選需顯示單位價與單位,候選待確認或缺資料則以「待補 / 候選待確認」呈現,不得捏造價格。 - V10.645 起 `/ai_intelligence` 的商品明細分流切換後,必須顯示「這類商品怎麼處理」的行動摘要,包含件數、近 7 天業績、平均可信度、最大價差、代表商品與主按鈕;使用者不得只能看到商品列表而不知道下一步。 - V10.646 起 `/ai_intelligence` 的商品明細必須提供搜尋與排序;搜尋至少涵蓋商品、分類、商品編號與 MOMO 候選資訊,排序至少支援優先級、近 7 天業績、價差、下滑幅度與可信度。搜尋/排序後的行動摘要與明細列表必須使用同一批結果。 diff --git a/docs/guides/pchome_ai_automation_priority_backlog.md b/docs/guides/pchome_ai_automation_priority_backlog.md index 90cbb08..ce7081c 100644 --- a/docs/guides/pchome_ai_automation_priority_backlog.md +++ b/docs/guides/pchome_ai_automation_priority_backlog.md @@ -74,6 +74,10 @@ - `direct-mapping-retry-candidate-exception-controlled-apply-compact-readback-package` 會回傳 apply、replay、drift、recovery 四段 compact receipt - 產品面可直接讀取 `product_status`、`next_machine_action`、selector readback、drift count 與 artifact hash 狀態 - compact readback 自身也可 materialize artifact 並驗證 hash +- Controlled-apply artifact retention policy 已完成: + - `direct-mapping-retry-candidate-exception-controlled-apply-artifact-retention-package` 會掃描 controlled apply 相關 artifact families + - 每個 family 依 `keep_latest_per_family` 保留最新 evidence,並額外保護 active compact readback chain + - 只產生 prune candidates 與 retention receipt,不直接刪檔、不寫 DB、不執行 destructive prune - AI debt scanner 顯示產品面清空: - `PRODUCT_SURFACE_CLEAR` - `finding_count=0` @@ -81,7 +85,7 @@ 進行中 / 下一步,必須照順序: -1. 建立 PChome controlled-apply artifacts retention policy,讓 evidence 可追蹤但不無限制膨脹。 +1. 
把 PChome AI automation lanes 與 compact readback 接進 product dashboard 第一視窗。 完成標準: @@ -196,8 +200,8 @@ | P0.7 | Automated drift monitor | 已完成 | smoke check `PChome 受控落地 drift monitor` | 納入每日 smoke 與 runtime readback | | P0.8 | Drift rollback / re-apply package | 已完成 | drift recovery package route + focused tests | 接入 compact readback | | P0.9 | Compact latest apply / replay / drift / recovery readback endpoint | 已完成 | compact readback route + focused tests | 接入 product dashboard first viewport | -| P0.10 | Controlled-apply artifact retention policy | 未開始 | compact artifacts exist | 下一個實作 | -| P1.1 | Dashboard AI automation first-viewport surface | 未開始 | API readiness + compact readback exist | P0 retention policy 後實作 | +| P0.10 | Controlled-apply artifact retention policy | 已完成 | retention policy route + focused tests | 接入 product dashboard first viewport | +| P1.1 | Dashboard AI automation first-viewport surface | 未開始 | API readiness + compact readback + retention policy exist | 下一個實作 | | P1.2 | UI wording guard for no raw engineering terms | 未開始 | existing guardrails only | 為新 automation surface 補 tests | | P2.1 | External benchmark encoded into requirements | 未開始 | benchmark guide exists | 更新 guardrails / tests | | P3.1 | Extend receipt / replay / drift pattern to more lanes | 未開始 | current retry lane complete | P1 後選下一條 safe lane | diff --git a/routes/ai_routes.py b/routes/ai_routes.py index 712acc0..aa55476 100644 --- a/routes/ai_routes.py +++ b/routes/ai_routes.py @@ -2617,6 +2617,43 @@ def api_pchome_growth_direct_mapping_retry_candidate_exception_controlled_apply_ }), 500 +@ai_bp.route('/api/ai/pchome-growth/mapping-backlog/direct-mapping-retry-candidate-exception-controlled-apply-artifact-retention-package') +@login_required +def api_pchome_growth_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package(): + """P2 read-only retention policy for controlled-apply artifact families.""" + try: + from config import DATABASE_PATH + from 
services.pchome_mapping_backlog_service import ( + build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package, + ) + + run_id = str(request.args.get('run_id') or '').strip() or None + keep_latest = request.args.get('keep_latest_per_family', 3, type=int) + keep_latest = max(1, min(keep_latest or 3, 20)) + materialize_artifacts = str(request.args.get('materialize_artifacts') or '').strip().lower() in {'1', 'true', 'yes'} + + engine = _create_icaim_dashboard_engine(DATABASE_PATH) + try: + package = build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package( + run_id=run_id, + keep_latest_per_family=keep_latest, + materialize_artifacts=materialize_artifacts, + engine=engine, + ) + finally: + engine.dispose() + package["source_endpoint"] = ( + "/api/ai/pchome-growth/mapping-backlog/direct-mapping-retry-candidate-exception-controlled-apply-artifact-retention-package" + ) + return jsonify(package) + except Exception as exc: + logger.error("[PChomeGrowth] direct mapping retry candidate exception controlled apply artifact retention 讀取失敗: %s", exc, exc_info=True) + return jsonify({ + "success": False, + "error": "PChome 商品對應 retry 例外 controlled apply artifact retention 暫時無法讀取,請稍後再試。", + }), 500 + + @ai_bp.route('/api/ai/pchome-growth/ai-automation-readiness') @login_required def api_pchome_growth_ai_automation_readiness(): diff --git a/services/pchome_mapping_backlog_service.py b/services/pchome_mapping_backlog_service.py index fe90f5c..546a226 100644 --- a/services/pchome_mapping_backlog_service.py +++ b/services/pchome_mapping_backlog_service.py @@ -83,6 +83,9 @@ DIRECT_MAPPING_RETRY_CANDIDATE_EXCEPTION_CONTROLLED_APPLY_DRIFT_RECOVERY_POLICY DIRECT_MAPPING_RETRY_CANDIDATE_EXCEPTION_CONTROLLED_APPLY_COMPACT_READBACK_POLICY = ( "read_only_pchome_growth_direct_mapping_retry_candidate_exception_controlled_apply_compact_readback" )
+DIRECT_MAPPING_RETRY_CANDIDATE_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_POLICY = ( + "read_only_pchome_growth_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention" +) AI_AUTOMATION_READINESS_POLICY = "read_only_pchome_growth_ai_automation_readiness" EVIDENCE_ENRICHMENT_PREVIEW_POLICY = "read_only_pchome_growth_evidence_enrichment_preview" EVIDENCE_SOURCE_PREVIEW_POLICY = "read_only_pchome_growth_evidence_source_preview" @@ -5284,6 +5287,296 @@ def build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_compa } +_CONTROLLED_APPLY_RETENTION_FAMILIES = [ + { + "family": "verifier_inputs", + "artifact_key": "retry_exception_closeout_verifier_input_artifact", + "subdir": "verifier_inputs", + }, + { + "family": "identity_readback", + "artifact_key": "retry_exception_closeout_identity_readback_artifact", + "subdir": "identity_readback", + }, + { + "family": "controlled_apply_preflight", + "artifact_key": "retry_exception_controlled_apply_preflight_artifact", + "subdir": "controlled_apply_preflight", + }, + { + "family": "controlled_apply_executor", + "artifact_key": "retry_exception_controlled_apply_executor_receipt", + "subdir": "controlled_apply_executor", + }, + { + "family": "controlled_apply_executor_replay", + "artifact_key": "retry_exception_controlled_apply_executor_replay_receipt", + "subdir": "controlled_apply_executor_replay", + }, + { + "family": "controlled_apply_drift_verifier", + "artifact_key": "retry_exception_controlled_apply_drift_verifier_receipt", + "subdir": "controlled_apply_drift_verifier", + }, + { + "family": "controlled_apply_drift_recovery", + "artifact_key": "retry_exception_controlled_apply_drift_recovery_receipt", + "subdir": "controlled_apply_drift_recovery", + }, + { + "family": "controlled_apply_compact_readback", + "artifact_key": "retry_exception_controlled_apply_compact_readback_receipt", + "subdir": "controlled_apply_compact_readback", + }, +] + + +def 
_retry_exception_artifact_retention_id(summary: dict[str, Any], protected_paths: list[str]) -> str: + payload = {"summary": summary, "protected_paths": protected_paths} + digest = hashlib.sha256( + json.dumps(payload, ensure_ascii=False, sort_keys=True, default=str).encode("utf-8") + ).hexdigest()[:16] + return f"pchome-retry-exception-controlled-apply-artifact-retention-{digest}" + + +def _scan_retry_exception_retention_family( + root: Path, + family: dict[str, str], + *, + keep_latest_per_family: int, + protected_relative_paths: set[str], +) -> dict[str, Any]: + artifact_dir = root / "artifacts" / "pchome_growth" / "retry_exception_closeout" / family["subdir"] + paths = sorted( + artifact_dir.glob("*.json") if artifact_dir.exists() else [], + key=lambda path: path.stat().st_mtime, + reverse=True, + ) + artifacts: list[dict[str, Any]] = [] + keep_count = 0 + prune_candidate_count = 0 + total_bytes = 0 + prune_candidate_bytes = 0 + for index, path in enumerate(paths, start=1): + relative_path = str(path.relative_to(root)) + byte_count = path.stat().st_size + total_bytes += byte_count + protected = index <= keep_latest_per_family or relative_path in protected_relative_paths + sha = hashlib.sha256(path.read_bytes()).hexdigest() + decision = "keep" if protected else "candidate_for_retention_prune" + if protected: + keep_count += 1 + else: + prune_candidate_count += 1 + prune_candidate_bytes += byte_count + artifacts.append({ + "family": family["family"], + "artifact_key": family["artifact_key"], + "relative_path": relative_path, + "payload_sha256": sha, + "byte_count": byte_count, + "latest_rank": index, + "protected_by_latest_window": index <= keep_latest_per_family, + "protected_by_active_chain": relative_path in protected_relative_paths, + "retention_decision": decision, + "delete_in_package": False, + "writes_database": False, + }) + return { + "family": family["family"], + "artifact_key": family["artifact_key"], + "subdir": family["subdir"], + "artifact_count": 
len(artifacts), + "keep_count": keep_count, + "prune_candidate_count": prune_candidate_count, + "total_byte_count": total_bytes, + "prune_candidate_byte_count": prune_candidate_bytes, + "artifacts": artifacts, + } + + +def _compact_readback_protected_relative_paths(compact_readback: dict[str, Any]) -> set[str]: + protected_paths: set[str] = set() + for receipt in (compact_readback.get("receipts") or {}).values(): + relative_path = receipt.get("relative_path") + if relative_path: + protected_paths.add(str(relative_path)) + compact_artifact = compact_readback.get("compact_artifact") or {} + if compact_artifact.get("relative_path"): + protected_paths.add(str(compact_artifact["relative_path"])) + return protected_paths + + +def build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package( + *, + artifact_root: str | Path | None = None, + run_id: str | None = None, + engine: Any = None, + source_compact_readback: dict[str, Any] | None = None, + keep_latest_per_family: int = 3, + materialize_artifacts: bool = False, +) -> dict[str, Any]: + """Build a no-delete retention policy package for controlled-apply artifacts.""" + root = Path(artifact_root) if artifact_root is not None else Path.cwd() / "data" + keep_latest = max(1, int(keep_latest_per_family or 3)) + compact_readback = source_compact_readback or build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_compact_readback_package( + artifact_root=root, + run_id=run_id, + engine=engine, + materialize_artifacts=False, + ) + protected_paths = _compact_readback_protected_relative_paths(compact_readback) + family_reports = [ + _scan_retry_exception_retention_family( + root, + family, + keep_latest_per_family=keep_latest, + protected_relative_paths=protected_paths, + ) + for family in _CONTROLLED_APPLY_RETENTION_FAMILIES + ] + artifact_count = sum(int(report.get("artifact_count") or 0) for report in family_reports) + keep_count = sum(int(report.get("keep_count") or 0) for 
report in family_reports) + prune_candidate_count = sum(int(report.get("prune_candidate_count") or 0) for report in family_reports) + total_bytes = sum(int(report.get("total_byte_count") or 0) for report in family_reports) + prune_candidate_bytes = sum(int(report.get("prune_candidate_byte_count") or 0) for report in family_reports) + protected_path_list = sorted(protected_paths) + summary = { + "retention_family_count": len(family_reports), + "artifact_count": artifact_count, + "retained_artifact_count": keep_count, + "prune_candidate_count": prune_candidate_count, + "total_byte_count": total_bytes, + "prune_candidate_byte_count": prune_candidate_bytes, + "keep_latest_per_family": keep_latest, + "protected_active_chain_count": len(protected_path_list), + "retention_prune_executes_count": 0, + "retention_artifact_materialized_count": 0, + "retention_artifact_hash_match_count": 0, + "writes_database_count": 0, + } + result = ( + "DIRECT_MAPPING_RETRY_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_POLICY_READY" + if artifact_count + else "WAITING_FOR_RETRY_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_INPUTS" + ) + retention_id = _retry_exception_artifact_retention_id(summary, protected_path_list) + safety = { + "ai_controlled_apply": True, + "artifact_retention": True, + "reads_artifact_files": True, + "reads_database": engine is not None, + "deletes_artifacts": False, + "retention_prune_executes": False, + "writes_database": False, + "writes_database_count": 0, + "writes_artifact_count": 0, + "syncs_external_offers": False, + "dispatches_telegram": False, + "gemini_allowed": False, + "requires_production_version_truth": True, + } + checks = [ + {"check": "compact_readback_loaded", "passed": bool(compact_readback)}, + {"check": "retention_families_scanned", "passed": len(family_reports) == len(_CONTROLLED_APPLY_RETENTION_FAMILIES)}, + {"check": "active_chain_paths_protected", "passed": bool(protected_path_list)}, + {"check": 
"retention_policy_does_not_delete_artifacts", "passed": True}, + {"check": "retention_policy_does_not_write_database", "passed": True}, + ] + artifact_payload = { + "artifact_key": "retry_exception_controlled_apply_artifact_retention_policy_receipt", + "retention_id": retention_id, + "source_policy": DIRECT_MAPPING_RETRY_CANDIDATE_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_POLICY, + "source_compact_readback_result": compact_readback.get("result"), + "result": result, + "summary": summary, + "protected_active_chain_paths": protected_path_list, + "family_reports": family_reports, + "checks": checks, + "safety": safety, + } + artifact_bytes = _canonical_retry_exception_artifact_bytes(artifact_payload) + artifact_relative_path = ( + f"artifacts/pchome_growth/retry_exception_closeout/" + f"controlled_apply_artifact_retention/{retention_id}.json" + ) + retention_artifact = { + "key": "retry_exception_controlled_apply_artifact_retention_policy_receipt", + "artifact_type": "controlled_apply_artifact_retention_policy_receipt", + "relative_path": artifact_relative_path, + "payload_sha256": hashlib.sha256(artifact_bytes).hexdigest(), + "byte_count": len(artifact_bytes), + "payload": artifact_payload, + "materialized": False, + "writes_database": False, + } + materialized_retention_artifacts: list[dict[str, Any]] = [] + if materialize_artifacts and artifact_count: + target_path = _resolve_retry_exception_artifact_path(root, artifact_relative_path) + target_path.parent.mkdir(parents=True, exist_ok=True) + target_path.write_bytes(artifact_bytes) + materialized_retention_artifacts.append({ + "key": retention_artifact["key"], + "relative_path": artifact_relative_path, + "absolute_path": str(target_path), + "payload_sha256": retention_artifact["payload_sha256"], + "written_byte_count": target_path.stat().st_size, + "writes_database": False, + }) + retention_artifact["materialized"] = True + retention_artifact["absolute_path"] = str(target_path) + artifact_path = 
_resolve_retry_exception_artifact_path(root, artifact_relative_path) + artifact_sha = hashlib.sha256(artifact_path.read_bytes()).hexdigest() if artifact_path.exists() else "" + artifact_hash_match = bool(artifact_sha) and artifact_sha == retention_artifact["payload_sha256"] + summary["retention_artifact_materialized_count"] = len(materialized_retention_artifacts) or (1 if artifact_hash_match else 0) + summary["retention_artifact_hash_match_count"] = 1 if artifact_hash_match else 0 + safety["writes_artifact_count"] = len(materialized_retention_artifacts) + checks.extend([ + { + "check": "retention_artifact_materialized_when_requested", + "passed": (not materialize_artifacts) or (artifact_count > 0 and artifact_path.exists()), + }, + { + "check": "retention_artifact_hash_matches_expected", + "passed": (not materialize_artifacts) or artifact_hash_match, + }, + ]) + return { + "policy": DIRECT_MAPPING_RETRY_CANDIDATE_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_POLICY, + "result": result, + "success": artifact_count > 0, + "summary": summary, + "artifact_retention": { + "retention_id": retention_id, + "stage": "P2_retry_exception_controlled_apply_artifact_retention", + "status": "ready" if artifact_count else "waiting", + "keep_latest_per_family": keep_latest, + "protected_active_chain_count": len(protected_path_list), + "materialize_artifacts": bool(materialize_artifacts), + "requires_production_version_truth": True, + }, + "protected_active_chain_paths": protected_path_list, + "family_reports": family_reports, + "retention_artifact": retention_artifact, + "materialized_retention_artifacts": materialized_retention_artifacts, + "post_retention_artifact_verifier": { + "expected_sha256": retention_artifact["payload_sha256"], + "actual_sha256": artifact_sha, + "hash_match": artifact_hash_match, + "writes_database": False, + }, + "checks": checks, + "check_count": len(checks), + "all_checks_passed": all(check.get("passed") is True for check in checks), + "next_actions": [ 
+ "Use prune candidates only after a separate controlled delete executor is added.", + "Keep the latest active compact readback chain protected before any artifact pruning.", + "Expose retained/prune candidate counts on the product dashboard before enabling prune execution.", + ], + "safety": safety, + } + + def build_pchome_evidence_enrichment_preview(payload: dict[str, Any], batch_size: int = 5) -> dict[str, Any]: """Build a read-only evidence enrichment package for mapping targets.""" operator_preview = build_pchome_mapping_operator_preview(payload, batch_size=batch_size) diff --git a/tests/test_pchome_mapping_backlog_report.py b/tests/test_pchome_mapping_backlog_report.py index 35ce31f..936a152 100644 --- a/tests/test_pchome_mapping_backlog_report.py +++ b/tests/test_pchome_mapping_backlog_report.py @@ -76,6 +76,7 @@ from services.pchome_mapping_backlog_service import ( build_pchome_direct_mapping_candidate_exception_resolution_closeout_package, build_pchome_direct_mapping_retry_candidate_decision_package, build_pchome_direct_mapping_retry_candidate_exception_auto_resolution_package, + build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package, build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_compact_readback_package, build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_drift_recovery_package, build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_drift_verifier_package, @@ -1528,6 +1529,40 @@ def test_direct_mapping_retry_candidate_exception_controlled_apply_receipt_repla assert drift_compact["summary"]["compact_readback_artifact_materialized_count"] == 1 assert drift_compact["post_compact_artifact_verifier"]["hash_match"] is True assert drift_compact["safety"]["writes_database"] is False + retention = build_pchome_direct_mapping_retry_candidate_exception_controlled_apply_artifact_retention_package( + artifact_root=tmp_path, + run_id=run_id, + engine=engine, + 
source_compact_readback=drift_compact, + keep_latest_per_family=1, + materialize_artifacts=True, + ) + assert retention["result"] == "DIRECT_MAPPING_RETRY_EXCEPTION_CONTROLLED_APPLY_ARTIFACT_RETENTION_POLICY_READY" + assert retention["summary"]["retention_family_count"] == 8 + assert retention["summary"]["artifact_count"] >= 8 + assert retention["summary"]["retained_artifact_count"] > 0 + assert retention["summary"]["prune_candidate_count"] >= 1 + assert retention["summary"]["artifact_count"] == ( + retention["summary"]["retained_artifact_count"] + + retention["summary"]["prune_candidate_count"] + ) + assert retention["summary"]["retention_prune_executes_count"] == 0 + assert retention["summary"]["retention_artifact_materialized_count"] == 1 + assert retention["summary"]["retention_artifact_hash_match_count"] == 1 + assert retention["post_retention_artifact_verifier"]["hash_match"] is True + assert retention["safety"]["deletes_artifacts"] is False + assert retention["safety"]["retention_prune_executes"] is False + assert retention["safety"]["writes_database"] is False + protected_paths = set(retention["protected_active_chain_paths"]) + assert drift_compact["compact_artifact"]["relative_path"] in protected_paths + prune_candidates = [ + artifact + for family in retention["family_reports"] + for artifact in family["artifacts"] + if artifact["retention_decision"] == "candidate_for_retention_prune" + ] + assert prune_candidates + assert all(candidate["delete_in_package"] is False for candidate in prune_candidates) assert call_count["search"] == 2
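
Reviewer note: the retention decision added in `_scan_retry_exception_retention_family` comes down to "rank by recency, keep the newest N, and also keep anything on the active compact readback chain". The following is a minimal standalone sketch of that rule, not the code in this diff. It assumes modification times are passed in rather than read from disk, and the names `ArtifactInfo` and `classify_family` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactInfo:
    # Hypothetical stand-in for one artifact file in a family.
    relative_path: str
    mtime: float


def classify_family(artifacts, keep_latest, protected_paths):
    """Map each relative path to a retention decision without deleting anything."""
    # Newest first, mirroring the mtime-descending sort used in the diff.
    ranked = sorted(artifacts, key=lambda a: a.mtime, reverse=True)
    decisions = {}
    for rank, artifact in enumerate(ranked, start=1):
        in_latest_window = rank <= keep_latest
        on_active_chain = artifact.relative_path in protected_paths
        protected = in_latest_window or on_active_chain
        decisions[artifact.relative_path] = (
            "keep" if protected else "candidate_for_retention_prune"
        )
    return decisions


family = [
    ArtifactInfo("executor/a.json", 100.0),
    ArtifactInfo("executor/b.json", 200.0),
    ArtifactInfo("executor/c.json", 300.0),
]
decisions = classify_family(
    family,
    keep_latest=1,
    protected_paths={"executor/a.json"},
)
```

With `keep_latest=1`, the newest file is kept by the latest window, the oldest is kept only because it is on the protected chain, and the middle one becomes a prune candidate. This matches the test's expectation that `keep_latest_per_family=1` yields at least one prune candidate while the compact artifact stays protected.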