feat(runtime): harden daily pipeline audit and verification

Tighten real-ingestion success rules, separate scheduled reports from historical rebuilds, and persist source-level runtime audit across daily pipeline runs. Also add the Phase 5 CI workflow contract plus verification updates and supporting docs so the full uncommitted change set can be validated together.
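
The separation of scheduled reports from historical rebuilds depends on how the `daily_report` upsert in `scripts/generate_daily_report.go` resolves a conflict on `report_date`. A minimal Go sketch of that precedence follows, assuming the three `CASE` expressions behave as written in this diff. `runSemantics` and `mergeOnConflict` are hypothetical names used only for illustration and do not exist in the repository.

```go
package main

import "fmt"

// runSemantics mirrors the three audit columns this commit adds
// (run_kind, trigger_source, is_official_daily). Hypothetical type.
type runSemantics struct {
	RunKind         string
	TriggerSource   string
	IsOfficialDaily bool
}

// mergeOnConflict models the CASE expressions in the daily_report upsert:
// an official incoming run always wins, a legacy_backfill row is always
// overwritten, and otherwise the existing row keeps its semantics.
// Hypothetical helper; the real logic lives in SQL.
func mergeOnConflict(existing, incoming runSemantics) runSemantics {
	if incoming.IsOfficialDaily || existing.TriggerSource == "legacy_backfill" {
		return incoming
	}
	return existing
}

func main() {
	official := runSemantics{RunKind: "scheduled", TriggerSource: "cron", IsOfficialDaily: true}
	rebuild := runSemantics{RunKind: "historical_rebuild", TriggerSource: "rebuild_script", IsOfficialDaily: false}
	legacy := runSemantics{RunKind: "scheduled", TriggerSource: "legacy_backfill", IsOfficialDaily: true}

	// A rebuild arriving after an official run leaves the official semantics in place.
	fmt.Printf("%+v\n", mergeOnConflict(official, rebuild))
	// A row that only carries migration defaults takes whatever arrives next.
	fmt.Printf("%+v\n", mergeOnConflict(legacy, rebuild))
	// An official run arriving after a rebuild replaces the rebuild semantics.
	fmt.Printf("%+v\n", mergeOnConflict(rebuild, official))
}
```

Under this reading, re-running a historical rebuild for a date that already has an official scheduled report should not demote that report, while rows backfilled by migration 007 adopt the semantics of the first real run that touches them.
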
2026-05-14 16:17:39 +08:00
parent 618dff33da
commit a8999abcb0
17 changed files with 880 additions and 45 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,48 @@
+name: CI
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+
+jobs:
+  go-test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-go@v5
+        with:
+          go-version: "1.22"
+          cache: true
+
+      - name: Run Go tests
+        run: go test ./...
+
+  frontend-build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: "20"
+          cache: npm
+          cache-dependency-path: frontend/package-lock.json
+
+      - name: Install frontend dependencies
+        working-directory: frontend
+        run: npm ci
+
+      - name: Build frontend
+        working-directory: frontend
+        run: npm run build
+
+  docker-build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Build container image
+        run: docker build -t llm-intelligence:ci .
--- a/cmd/server/main.go
+++ b/cmd/server/main.go
@@ -253,6 +253,8 @@ func fetchLatestReport(ctx context.Context, db *sql.DB) (*latestReportResponse,
 		FROM daily_report
 		WHERE output_path IS NOT NULL
 		  AND output_path <> ''
+		  AND status = 'generated'
+		  AND COALESCE(is_official_daily, true) = true
 		ORDER BY report_date DESC, updated_at DESC
 		LIMIT 1
 	`).Scan(
--- a/db/migrations/007_report_run_audit_semantics.sql
+++ b/db/migrations/007_report_run_audit_semantics.sql
@@ -0,0 +1,51 @@
+-- Distinguish the run semantics of official daily reports, manual runs, and historical rebuilds
+
+DO $$
+BEGIN
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'daily_report' AND column_name = 'run_kind'
+    ) THEN
+        ALTER TABLE daily_report ADD COLUMN run_kind TEXT NOT NULL DEFAULT 'scheduled';
+    END IF;
+
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'daily_report' AND column_name = 'trigger_source'
+    ) THEN
+        ALTER TABLE daily_report ADD COLUMN trigger_source TEXT NOT NULL DEFAULT 'legacy_backfill';
+    END IF;
+
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'daily_report' AND column_name = 'is_official_daily'
+    ) THEN
+        ALTER TABLE daily_report ADD COLUMN is_official_daily BOOLEAN NOT NULL DEFAULT TRUE;
+    END IF;
+
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'report_runs' AND column_name = 'run_kind'
+    ) THEN
+        ALTER TABLE report_runs ADD COLUMN run_kind TEXT NOT NULL DEFAULT 'unknown';
+    END IF;
+
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'report_runs' AND column_name = 'trigger_source'
+    ) THEN
+        ALTER TABLE report_runs ADD COLUMN trigger_source TEXT NOT NULL DEFAULT 'legacy_backfill';
+    END IF;
+
+    IF NOT EXISTS (
+        SELECT 1 FROM information_schema.columns
+        WHERE table_name = 'report_runs' AND column_name = 'is_official_daily'
+    ) THEN
+        ALTER TABLE report_runs ADD COLUMN is_official_daily BOOLEAN NOT NULL DEFAULT FALSE;
+    END IF;
+END $$;
+
+CREATE INDEX IF NOT EXISTS idx_daily_report_official_daily ON daily_report(is_official_daily);
+CREATE INDEX IF NOT EXISTS idx_daily_report_run_kind ON daily_report(run_kind);
+CREATE INDEX IF NOT EXISTS idx_report_runs_run_kind ON report_runs(run_kind);
+CREATE INDEX IF NOT EXISTS idx_report_runs_official_daily ON report_runs(is_official_daily);
--- a/docs/plans/2026-05-14-runtime-trust-gap-remediation-plan.md
+++ b/docs/plans/2026-05-14-runtime-trust-gap-remediation-plan.md
@@ -0,0 +1,189 @@
+# Runtime Trust Gap Remediation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
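+**Goal:** Systematically fix the three gaps in the daily report and collection pipeline that undermine authenticity and long-term trustworthiness, so that every "scheduled daily output" comes from real collection, an auditable run, and a pipeline that covers multiple data sources.
+
+**Architecture:** Keep the existing Phase 1/Phase 2 design and reinforce only the run-semantics and audit layers. Split "did collection really succeed", "is this run an official daily output or a historical rebuild", and "did multi-source data enter the scheduled pipeline" into independent states, and have `run_daily.sh`, the daily report generator, the verification scripts, and the database records all use the same run semantics. First fix the lenient success checks that most easily mask real failures, then separate the audit trails, and finally bring multi-source collection into automatic scheduling.
+
+**Tech Stack:** Bash, Go 1.22, PostgreSQL, cron, html/template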
+
+---
+
+### Task 1: Tighten the "collection succeeded" check so that mock data and failed database writes cannot pass as success
+
+**Files:**
+- Modify: `scripts/fetch_openrouter.go`
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/verify_phase3.sh`
+- Test: `scripts/fetch_openrouter_test.go`
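+- Test: shell verification for `scripts/run_daily` (the existing verify scripts can be used at first)
+
+**Step 1: Write failing tests**
+
+Add three failure scenarios:
+- Without `OPENROUTER_API_KEY`, the scheduled pipeline must not count as a successful real collection
+- When the `summarizeDB` database write fails, `fetch_openrouter` in "real mode" should exit non-zero
+- `run_daily.sh` must not pass the quality check merely because old data already exists in the database
+
+**Step 2: Run the tests to confirm current behavior is too lenient**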
+
+Run:
+- `go test -tags llm_script scripts/fetch_openrouter.go scripts/fetch_openrouter_test.go`
+- `bash scripts/verify_phase3.sh`
+
+Expected:
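+- The risk of mock data, degraded writes, or old data masking real failures becomes visible
+
+**Step 3: Minimal implementation**
+
+Suggested tightening, in two layers:
+- `fetch_openrouter.go`: add a strict mode or an explicit run mode; real scheduled runs require a successful database write by default and exit non-zero otherwise
+- `run_daily.sh`: make the quality check require that this run produced write traces dated today, rather than looking only at the historical total
+- `run_real_pipeline.sh`: treat only "real collection + real database write + real daily report generation" as success
+
+**Step 4: Re-run verification**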
+
+Run:
+- `bash scripts/run_daily.sh`
+- `bash scripts/run_real_pipeline.sh`
+- `bash scripts/verify_phase3.sh`
+
+Expected:
+- Real failures actually fail
+- Mock data, JSON-only writes, and old data can no longer pass as completed
+
+**Step 5: Commit**
+
+```bash
+git add scripts/fetch_openrouter.go scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/verify_phase3.sh scripts/fetch_openrouter_test.go
+git commit -m "fix(runtime): harden daily ingestion success checks"
+```
+
+### Task 2: Route official daily reports and historical rebuilds into distinct run semantics to fix mixed audit records
+
+**Files:**
+- Modify: `scripts/generate_daily_report.go`
+- Modify: `scripts/rebuild_historical_report.sh`
+- Modify: `scripts/report_utils.sh`
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/verify_phase3.sh`
+- Test: `scripts/generate_daily_report_test.go`
+
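+**Step 1: Write failing tests**
+
+Add tests verifying that:
+- Official daily output and historical rebuilds are written with different run kinds
+- A historical rebuild must not pose as "scheduled daily output"
+- `fetchLatestReport` and the frontend's latest-report read still target official output only
+
+**Step 2: Run the tests to confirm the current mixing**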
+
+Run:
+- `go test -tags llm_script scripts/generate_daily_report.go scripts/generate_daily_report_test.go`
+
+Expected:
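+- The run semantics of `daily_report` / `report_runs` do not yet distinguish official runs from rebuilds
+
+**Step 3: Minimal implementation**
+
+Add and standardize the following semantic fields:
+- `run_kind`: `scheduled` / `historical_rebuild` / `manual`
+- `trigger_source`: `cron` / `cli` / `rebuild_script`
+- `is_official_daily`: whether the run is the day's official scheduled output
+
+Suggested landing points:
+- Database writes in `generate_daily_report.go` carry the run kind
+- `rebuild_historical_report.sh` forces the historical-rebuild semantics
+- The frontend and API read only official output as the "latest report" by default
+
+**Step 4: Re-run verification**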
+
+Run:
+- `go test ./...`
+- `bash scripts/rebuild_historical_report.sh 2025-08-07`
+- `bash scripts/run_daily.sh`
+
+Expected:
+- Historical rebuilds and daily output can coexist but are no longer lumped together at the audit layer
+
+**Step 5: Commit**
+
+```bash
+git add scripts/generate_daily_report.go scripts/rebuild_historical_report.sh scripts/report_utils.sh scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/verify_phase3.sh scripts/generate_daily_report_test.go
+git commit -m "feat(audit): separate scheduled and rebuild report runs"
+```
+
+### Task 3: Bring multi-source data into the same daily automated scheduling pipeline
+
+**Files:**
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/fetch_multi_source.go`
+- Create or Modify: `scripts/fetch_multi_source_test.go`
+- Modify: `scripts/verify_phase3.sh`
+- Modify: `scripts/verify_phase5.sh`
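+- Modify as needed: `scripts/import_phase2_data.go`, `scripts/import_zhipu_data.go`, `scripts/import_bytedance_data.go`
+
+**Step 1: Write failing tests**
+
+Add tests verifying that:
+- The scheduling pipeline knows explicitly which sources took part in the day's sync
+- The daily sync of at least OpenRouter, domestic vendors, and aggregator platforms is visible at the verification layer
+
+**Step 2: Design a minimal scheduling orchestration**
+
+Split daily scheduling into enumerable stages: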
+- `openrouter`
+- `multi_source`
+- `official_imports`
+- `daily_report`
+
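+And define a failure policy for each stage:
+- If any required source fails, the daily report should be marked degraded or failed and must not pass as fully successful
+- Some official imports may continue after a single-source failure, but the run record must keep a source-level failure trace
+
+**Step 3: Minimal implementation**
+
+Suggested priority:
+- First wire `fetch_multi_source.go` into daily scheduling
+- Then wire the existing official import scripts into an optional daily supplementary sync stage
+- Finally unify the audit output so that `report_runs` shows the set of sources triggered in this run and the set of sources that failed
+
+**Step 4: Re-run verification**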
+
+Run:
+- `go test -tags llm_script scripts/fetch_multi_source.go scripts/fetch_multi_source_test.go`
+- `bash scripts/run_daily.sh`
+- `bash scripts/verify_phase3.sh`
+- `bash scripts/verify_phase5.sh`
+
+Expected:
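+- Daily scheduling no longer proves only that OpenRouter refreshes on its own
+- Multi-source sync is recognizable at both the scheduling and acceptance layers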
+
+**Step 5: Commit**
+
+```bash
+git add scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/fetch_multi_source.go scripts/fetch_multi_source_test.go scripts/verify_phase3.sh scripts/verify_phase5.sh
+git commit -m "feat(runtime): fold multi-source sync into daily pipeline"
+```
+
+---
+
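+### Suggested execution order
+
+1. Do **Task 1** first: it is the problem most likely to disguise a "false success" as a "real success", so it carries the highest risk.
+2. Then do **Task 2** to separate the audit boundary between official daily reports and historical rebuilds.
+3. Finally do **Task 3** to bring multi-source sync fully into the daily scheduling pipeline.
+
+### Suggested acceptance order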
+
+1. `bash scripts/run_daily.sh`
+2. `bash scripts/rebuild_historical_report.sh <date>`
+3. `bash scripts/verify_phase3.sh`
+4. `bash scripts/verify_phase5.sh`
+5. `go test ./...`
+
--- a/reports/openclaw/OPENCLAW_CAPABILITY_BACKLOG.md
+++ b/reports/openclaw/OPENCLAW_CAPABILITY_BACKLOG.md
@@ -10,7 +10,7 @@

 ---

-## Quick reference: currently unfixed issues (as of 2026-05-13 09:30)
+## Quick reference: currently unfixed issues (as of 2026-05-14 15:10)

 | # | Issue | Priority | First exposed | Fix status | Occurrences |
 |---|------|--------|----------|----------|----------|
@@ -24,25 +24,79 @@
 | 8 | No commit prompt triggered after file modifications | P2→P1 | 05-08 09:05 | ❌ Not fixed | 12 |
 | 9 | cron review spins idle when there is no delta | P1 | 05-08 09:12 | ❌ Not fixed | 12 |
 | 10 | Pseudo-progress in verification mode (limits of artifact_present) | P1 | 05-08 14:30 | ❌ Not fixed | 9 |
-| 11 | **Project commit stagnation** | **P0** | **05-08 21:30** | **❌ Not fixed (latest commit still dated 05-08)** | **12** |
+| 11 | **Project commit stagnation** | **P0** | **05-08 21:30** | **❌ Not fixed (new commits exist, but the long-standing dirty working tree persists)** | **13** |
 | 12 | Review reports do not trigger fix actions | P2→P1 | 05-08 21:30 | ❌ Not fixed | 9 |
 | 13 | BACKLOG file bloat keeps raising review cost | P1 | 05-09 09:30 | ⚠️ Partial (tiered archiving in place, but the main file keeps growing) | 7 |
-| 14 | **Untracked core code not under version control** | **P0** | **05-10 21:30** | **❌ Not fixed (many files still untracked this round)** | **7** |
-| 15 | **CI config exists but has not been verified to run** | **P1** | **05-10 21:30** | **❌ Not fixed (still only artifact-present)** | **7** |
+| 14 | **Untracked core code not under version control** | **P0** | **05-10 21:30** | **❌ Not fixed (new untracked files again this round)** | **8** |
+| 15 | **CI config exists but has not been verified to run** | **P1** | **05-10 21:30** | **❌ Not fixed (and a more upstream problem is now exposed: the CI file is missing)** | **8** |
 | 16 | **Phase 6+ scope undefined** | **P1** | **05-10 21:30** | **❌ Not fixed** | **5** |
 | 17 | Table name mismatch: collection_stats vs collector_stats | P2 | 05-11 09:30 | ✅ **Clarified as a false positive** (05-11 14:30: verify_phase2.sh confirmed consistent with the schema) | 1 |
 | 18 | **No .gitignore file** | **P1** | **05-11 14:30** | **❌ Not fixed** | **3** |
-| 19 | **Review false positives propagate** | **P1** | **05-11 14:30** | **❌ Not fixed** | **4** |
-| 20 | **Untracked files missed in the count** | **P1** | **05-11 14:30** | **❌ Not fixed** | **3** |
-| 21 | **Transient regressions in acceptance scripts lack stability markers** | **P1** | **05-12 22:46** | **❌ Not fixed (this round again shows a single FAIL may recover by the next round)** | **3** |
+| 19 | **Review false positives propagate** | **P1** | **05-11 14:30** | **❌ Not fixed** | **5** |
+| 20 | **Untracked files missed in the count** | **P1** | **05-11 14:30** | **❌ Not fixed** | **4** |
+| 21 | **Transient regressions in acceptance scripts lack stability markers** | **P1** | **05-12 22:46** | **❌ Not fixed (transient / repeated / reproducible markers still missing)** | **4** |
 | 22 | **No-delta scenarios lack an aging-risk-first strategy** | **P2** | **05-12 22:46** | **❌ Not fixed** | **3** |
-| 23 | **Daily report archive path gate mismatch** | **P0** | **05-13 00:15** | **⚠️ Pending re-check (not reproduced this round; `verify_phase6.sh` now PASSes)** | **1** |
-| 24 | **Aggregated errors in combined acceptance mislead root-cause analysis** | **P1** | **05-13 00:15** | **❌ Not fixed** | **1** |
+| 23 | **Daily report archive path gate mismatch** | **P0** | **05-13 00:15** | **⚠️ Pending re-check (not reproduced on 05-14, morning or afternoon)** | **1** |
+| 24 | **Aggregated errors in combined acceptance mislead root-cause analysis** | **P1** | **05-13 00:15** | **❌ Not fixed** | **2** |
+| 25 | **Drift between snapshot truth and current truth is not flagged explicitly** | **P1** | **05-14 09:31** | **❌ Not fixed** | **2** |
+| 26 | **Phase 6 stability gate failure lacks a sample-window summary** | **P1** | **05-14 15:10** | **❌ Not fixed** | **1** |

 ---

 ## Review log

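+### 2026-05-14 15:10 (review #20, afternoon-review)
+
+> **Preface**: About **5 hours 39 minutes** since the previous review (05-14 09:31). This round is not a "no delta" scenario: the morning's live conclusion was still "the only blocker is the missing CI file", but in the afternoon `verify_phase6.sh` exposed a second active FAIL: **the success rate of the last 7 collections is below 95%**. The morning conclusion had therefore aged by the afternoon, so the current truth must be updated explicitly rather than mechanically reusing the old one-liner.
+
+#### New findings this round
+
+- **The main functional pipeline still runs**: Phases 1~4 in `verify_pre_phase6.sh` continue to PASS; Go tests, real collection, the API, and the frontend test entry point all remain healthy.
+- **The production-grade overall gate currently has two real blockers**: besides the missing `.github/workflows/ci.yml` in Phase 5, there is a new top-level Phase 6 FAIL on "success rate of the last 7 collections reaches 95%".
+- **The morning review conclusion is outdated by the afternoon**: continuing to say "only the CI file is missing" would overlook a real problem that is closer to production stability.
+- **The top-level stability gate lacks a failed-sample summary**: the output shows only that the threshold was missed, not the 7-sample window details or the failure distribution.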
+
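+#### Issue 26 (P1): Phase 6 stability gate failure lacks a sample-window summary
+
+- **Status at 15:10**: `bash scripts/verify_phase6.sh` currently FAILs at `最近 7 次采集成功率达到 95%` (the "success rate of the last 7 collections reaches 95%" check), but the top-level output neither prints the success/fail details of the 7 samples nor shows the timestamps or causes of the failed records.
+- **Impact**:
+  - The review can tell that the threshold was missed, but not immediately why
+  - Someone has to query the database or scripts manually, which raises the cost of diagnosis
+  - It becomes easy to reduce a stability problem to a one-off transient fluctuation, or conversely to misread a one-off fluctuation as a structural regression
+- **Suggested improvements**:
+  1. When the success-rate gate in `verify_phase6.sh` fails, append a summary of the time, success, source, and error of the last 7 `collector_stats` rows
+  2. Split the output into two layers: `threshold failed` + `sample window details`
+  3. If the failures come from a single source or a single time window, print aggregate counts to help tell systemic problems from one-off anomalies
+- **Priority**: P1
+- **Suggested verification**: Create a failed collection record artificially, or insert samples in a test environment, and confirm that a failing `verify_phase6.sh` prints the details of the last 7-sample window instead of only the threshold verdict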
+
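+#### Issue 25 (P1, reconfirmed): Drift between snapshot truth and current truth is not flagged explicitly
+
+- **Status at 15:10**: The morning report's "the only direct blocker is the missing CI file" no longer holds in the afternoon, because the afternoon live verifier added a stability gate FAIL.
+- **Impact**:
+  - If the review system does not flag that the old conclusion has aged, readers can easily take the morning snapshot for the afternoon's current truth
+  - Backlog and daily-report-style reviews end up underestimating how time-sensitive dynamic gates are
+- **Suggested improvements**:
+  1. Keep and strengthen the "delta since the previous review" field in the review template
+  2. When live verifier results contradict a key one-liner from the previous round, require an explicit statement that the old wording no longer holds
+  3. Keep counting this class of issue separately in the backlog instead of mixing it with ordinary false positives
+- **Priority**: P1
+- **Suggested verification**: If verifier results change between two reviews on the same day, check that the new report explicitly marks the old conclusion as invalid or aged instead of reusing the old summary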
+
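+#### Issue 24 (P1, still not fixed): Aggregated errors in combined acceptance mislead root-cause analysis
+
+- **Status at 15:10**: This round again shows that the top-level `verify_phase6.sh` only reports the aggregate FAIL `Phase 1~5 总门禁通过` (the "Phase 1~5 overall gate passes" check); one still has to drill down manually into `verify_pre_phase6.sh` to find that the specific failure is in Phase 5.
+- **Impact**:
+  - If the review stops at the top level, it takes the aggregate FAIL for the root cause
+  - Once other top-level FAILs pile on (this round, the success-rate gate), readers find it harder to separate "production close-out problems" from "runtime stability problems"
+- **Suggested improvements**:
+  1. When `verify_phase6.sh` fails, echo the name of the failing phase from `verify_pre_phase6.sh` directly
+  2. Classify failure output by blocker type: `artifact missing`, `runtime instability`, `aggregated dependency fail`
+- **Priority**: P1
+- **Suggested verification**: Keep one sub-phase failing, trigger an additional top-level FAIL, and confirm the output distinguishes the blocker categories directly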
+
+---
+
 ### 2026-05-13 09:30 (review #18, morning-review)

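 > **Preface**: About **9 hours 15 minutes** since the previous review (05-13 00:15). The key delta in repository state this round: `verify_phase6.sh`, recorded as FAIL last round, returned to **PASS** when actually run this round. The archive gate problem exposed last round is therefore not reproduced now; by contrast, version-control stagnation and the large number of untracked files still show no delta and remain the most aged and most real systemic risks.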
@@ -113,4 +167,4 @@

 ---

-*Backlog last updated: 2026-05-13 09:30 Asia/Shanghai*
+*Backlog last updated: 2026-05-14 15:10 Asia/Shanghai*
--- a/scripts/fetch_multi_source.go
+++ b/scripts/fetch_multi_source.go
@@ -64,11 +64,14 @@ type sourceDefinition struct {
 }

 type runSummary struct {
-	SelectedSources   int
-	SuccessfulSources int
-	TotalModels       int
-	DomesticModels    int
-	CurrencyCounts    map[string]int
+	SelectedSources      int
+	SelectedSourceKeys   []string
+	SuccessfulSources    int
+	SuccessfulSourceKeys []string
+	FailedSourceKeys     []string
+	TotalModels          int
+	DomesticModels       int
+	CurrencyCounts       map[string]int
 }

 type pricingMetadataFields struct {
@@ -256,12 +259,15 @@ func listSourceKeys(apiKey string) []string {
 	return keys
 }

-func summarizePrices(selectedSources int, successfulSources int, prices []ModelPricing) runSummary {
+func summarizePrices(selectedSourceKeys []string, successfulSourceKeys []string, failedSourceKeys []string, prices []ModelPricing) runSummary {
 	summary := runSummary{
-		SelectedSources:   selectedSources,
-		SuccessfulSources: successfulSources,
-		TotalModels:       len(prices),
-		CurrencyCounts:    make(map[string]int),
+		SelectedSources:      len(selectedSourceKeys),
+		SelectedSourceKeys:   append([]string(nil), selectedSourceKeys...),
+		SuccessfulSources:    len(successfulSourceKeys),
+		SuccessfulSourceKeys: append([]string(nil), successfulSourceKeys...),
+		FailedSourceKeys:     append([]string(nil), failedSourceKeys...),
+		TotalModels:          len(prices),
+		CurrencyCounts:       make(map[string]int),
 	}
 	for _, price := range prices {
 		if strings.EqualFold(price.ProviderCountry, "CN") {
@@ -272,6 +278,21 @@ func summarizePrices(selectedSources int, successfulSources int, prices []ModelP
 	return summary
 }

+func sourceKey(src DataSource) string {
+	switch strings.ToLower(strings.TrimSpace(src.Name())) {
+	case "openrouter":
+		return "openrouter"
+	case "moonshot":
+		return "moonshot"
+	case "deepseek":
+		return "deepseek"
+	case "openai":
+		return "openai"
+	default:
+		return strings.ToLower(strings.ReplaceAll(strings.TrimSpace(src.Name()), " ", "_"))
+	}
+}
+
 func formatCountMap(counts map[string]int) string {
 	if len(counts) == 0 {
 		return "none"
@@ -289,17 +310,27 @@ func formatCountMap(counts map[string]int) string {
 	return strings.Join(parts, ",")
 }

+func formatKeyList(keys []string) string {
+	if len(keys) == 0 {
+		return "none"
+	}
+	return strings.Join(keys, ",")
+}
+
 func printSummary(w io.Writer, summary runSummary) error {
 	if w == nil {
 		return nil
 	}
 	_, err := fmt.Fprintf(
 		w,
-		"sources=%d successful_sources=%d models=%d domestic_models=%d currencies=%s\n",
+		"sources=%d successful_sources=%d models=%d domestic_models=%d selected_source_keys=%s successful_source_keys=%s failed_source_keys=%s currencies=%s\n",
 		summary.SelectedSources,
 		summary.SuccessfulSources,
 		summary.TotalModels,
 		summary.DomesticModels,
+		formatKeyList(summary.SelectedSourceKeys),
+		formatKeyList(summary.SuccessfulSourceKeys),
+		formatKeyList(summary.FailedSourceKeys),
 		formatCountMap(summary.CurrencyCounts),
 	)
 	return err
@@ -564,23 +595,29 @@ func defaultDSN() string {

 func runCollector(cfg runConfig, sources []DataSource, saveFn func([]ModelPricing) error, out io.Writer) error {
 	allPrices := make([]ModelPricing, 0)
-	successfulSources := 0
+	selectedSourceKeys := make([]string, 0, len(sources))
+	successfulSourceKeys := make([]string, 0, len(sources))
+	failedSourceKeys := make([]string, 0)

 	for _, src := range sources {
+		key := sourceKey(src)
+		selectedSourceKeys = append(selectedSourceKeys, key)
+
 		prices, err := src.FetchPricing()
 		if err != nil {
 			logger.Error("采集失败", "source", src.Name(), "error", err)
+			failedSourceKeys = append(failedSourceKeys, key)
 			continue
 		}
-		successfulSources++
+		successfulSourceKeys = append(successfulSourceKeys, key)
 		allPrices = append(allPrices, prices...)
 	}

-	summary := summarizePrices(len(sources), successfulSources, allPrices)
+	summary := summarizePrices(selectedSourceKeys, successfulSourceKeys, failedSourceKeys, allPrices)
 	if err := printSummary(out, summary); err != nil {
 		return err
 	}
-	if successfulSources == 0 {
+	if summary.SuccessfulSources == 0 {
 		return fmt.Errorf("no data source collected successfully")
 	}
 	if cfg.DryRun {
@@ -593,7 +630,7 @@ func runCollector(cfg runConfig, sources []DataSource, saveFn func([]ModelPricin
 		return err
 	}

-	logger.Info("多源采集完成", "total_models", len(allPrices), "sources", successfulSources)
+	logger.Info("多源采集完成", "total_models", len(allPrices), "sources", summary.SuccessfulSources)
 	return nil
 }

--- a/scripts/fetch_multi_source_test.go
+++ b/scripts/fetch_multi_source_test.go
@@ -90,6 +90,49 @@ func TestRunCollectorDryRunSkipsDatabaseWrite(t *testing.T) {
 	if !bytes.Contains(out.Bytes(), []byte("currencies=CNY:2,USD:1")) {
 		t.Fatalf("expected currency summary, got %q", output)
 	}
+	if !bytes.Contains(out.Bytes(), []byte("selected_source_keys=moonshot,openai")) {
+		t.Fatalf("expected selected source keys in summary, got %q", output)
+	}
+	if !bytes.Contains(out.Bytes(), []byte("successful_source_keys=moonshot,openai")) {
+		t.Fatalf("expected successful source keys in summary, got %q", output)
+	}
+	if !bytes.Contains(out.Bytes(), []byte("failed_source_keys=none")) {
+		t.Fatalf("expected failed source keys in summary, got %q", output)
+	}
+}
+
+func TestRunCollectorReportsFailedSourceKeys(t *testing.T) {
+	cfg := runConfig{DryRun: true}
+	var out bytes.Buffer
+
+	err := runCollector(
+		cfg,
+		[]DataSource{
+			fakeSource{
+				name: "Moonshot",
+				prices: []ModelPricing{
+					{ModelID: "kimi-k2.6", ProviderCountry: "CN", Currency: "CNY"},
+				},
+			},
+			fakeSource{
+				name: "OpenAI",
+				err:  bytes.ErrTooLarge,
+			},
+		},
+		nil,
+		&out,
+	)
+	if err != nil {
+		t.Fatalf("runCollector returned error: %v", err)
+	}
+
+	output := out.String()
+	if !bytes.Contains(out.Bytes(), []byte("successful_source_keys=moonshot")) {
+		t.Fatalf("expected successful source keys in summary, got %q", output)
+	}
+	if !bytes.Contains(out.Bytes(), []byte("failed_source_keys=openai")) {
+		t.Fatalf("expected failed source keys in summary, got %q", output)
+	}
 }

 func TestPricingMetadataClassifiesSourceType(t *testing.T) {
--- a/scripts/fetch_openrouter.go
+++ b/scripts/fetch_openrouter.go
@@ -33,6 +33,7 @@ type Config struct {
 	TimeoutSec int
 	BatchSize  int
 	DBConn     string
+	StrictReal bool
 }

 // ModelInfo holds model information (compatible with the collectors package)
@@ -99,6 +100,7 @@ func parseArgs() Config {
 	timeoutSec := flag.Int("timeout", 30, "请求超时（秒）")
 	batchSize := flag.Int("batch", 100, "批量插入批次大小")
 	dbConn := flag.String("db", os.Getenv("DATABASE_URL"), "PostgreSQL 连接字符串")
+	strictReal := flag.Bool("strict-real", false, "严格真实模式：缺少 API Key 或数据库写入失败时返回错误")
 	flag.Parse()
 	return Config{
 		APIKey:     *apiKey,
@@ -108,6 +110,7 @@ func parseArgs() Config {
 		TimeoutSec: *timeoutSec,
 		BatchSize:  *batchSize,
 		DBConn:     *dbConn,
+		StrictReal: *strictReal,
 	}
 }

@@ -158,6 +161,9 @@ func run(cfg Config) error {
 	if cfg.DBConn != "" {
 		if err := summarizeDB(cfg.DBConn, models, cfg.BatchSize); err != nil {
 			logger.Error("PostgreSQL 写入失败", "error", err)
+			if cfg.StrictReal {
+				return fmt.Errorf("PostgreSQL 写入失败: %w", err)
+			}
 			logger.Warn("降级为仅写入 JSON")
 		} else {
 			logger.Info("PostgreSQL 写入完成", "records", len(models))
@@ -169,6 +175,9 @@ func run(cfg Config) error {
 // fetchModels fetches the OpenRouter model list (with exponential-backoff retries)
 func fetchModels(cfg Config) ([]ModelInfo, error) {
 	if cfg.APIKey == "" {
+		if cfg.StrictReal {
+			return nil, fmt.Errorf("严格真实模式下必须提供 API Key")
+		}
 		logger.Warn("未提供 API Key，使用模拟数据")
 		return []ModelInfo{
 			{ID: "openai/gpt-4o", ContextLength: 128000, Pricing: ModelPricing{Input: 2.5, Output: 10.0}},
--- a/scripts/fetch_openrouter_test.go
+++ b/scripts/fetch_openrouter_test.go
@@ -4,6 +4,8 @@ package main

 import (
 	"encoding/json"
+	"net/http"
+	"net/http/httptest"
 	"os"
 	"path/filepath"
 	"testing"
@@ -98,3 +100,33 @@ func TestRunNoAPIKey(t *testing.T) {
 		t.Error("models 为空")
 	}
 }
+
+func TestFetchModelsFailsInStrictRealModeWithoutAPIKey(t *testing.T) {
+	_, err := fetchModels(Config{StrictReal: true})
+	if err == nil {
+		t.Fatal("strict real mode should fail without API key")
+	}
+}
+
+func TestRunFailsInStrictRealModeWhenDBWriteFails(t *testing.T) {
+	tmpDir := t.TempDir()
+	outPath := filepath.Join(tmpDir, "models.json")
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		w.Header().Set("Content-Type", "application/json")
+		_, _ = w.Write([]byte(`{"data":[{"id":"openai/gpt-4o","name":"GPT-4o","context_length":128000,"pricing":{"input":2.5,"output":10.0}}]}`))
+	}))
+	defer server.Close()
+
+	err := run(Config{
+		APIKey:     "test-key",
+		APIURL:     server.URL,
+		OutPath:    outPath,
+		DBConn:     "postgres://invalid@127.0.0.1:1/invalid?sslmode=disable",
+		BatchSize:  10,
+		TimeoutSec: 1,
+		StrictReal: true,
+	})
+	if err == nil {
+		t.Fatal("strict real mode should fail when database write fails")
+	}
+}
--- a/scripts/generate_daily_report.go
+++ b/scripts/generate_daily_report.go
@@ -22,6 +22,13 @@ import (

 var logger *slog.Logger

+type ReportRunContext struct {
+	RunKind         string
+	TriggerSource   string
+	IsOfficialDaily bool
+	RuntimeAudit    string
+}
+
 func init() {
 	logger = slog.New(slog.NewJSONHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelInfo}))
 }
@@ -89,6 +96,14 @@ func run() error {
 	if err != nil {
 		return err
 	}
+	runContext := resolveReportRunContext(
+		date,
+		time.Now(),
+		os.Getenv("REPORT_RUN_KIND"),
+		os.Getenv("REPORT_TRIGGER_SOURCE"),
+		os.Getenv("REPORT_IS_OFFICIAL_DAILY"),
+		os.Getenv("REPORT_RUNTIME_AUDIT"),
+	)

 	// 1. 获取报告数据(使用新schema)
 	report, err := generateReportDataV3(db, date)
@@ -122,7 +137,7 @@ func run() error {
 	}

 	// 6. 同步写入日报状态与运行轨迹
-	if err := saveReportTrackingV3(db, report, mdPath); err != nil {
+	if err := saveReportTrackingV3(db, report, mdPath, runContext); err != nil {
 		logger.Warn("保存日报记录失败", "error", err)
 	}

@@ -165,6 +180,43 @@ func resolveReportDate(now time.Time, args []string, envDate string) (string, er
 	return parsed.Format("2006-01-02"), nil
 }

+func resolveReportRunContext(reportDate string, now time.Time, envRunKind, envTriggerSource, envOfficialDaily, envRuntimeAudit string) ReportRunContext {
+	runKind := strings.TrimSpace(envRunKind)
+	if runKind == "" {
+		runKind = "manual"
+	}
+
+	triggerSource := strings.TrimSpace(envTriggerSource)
+	if triggerSource == "" {
+		triggerSource = "cli"
+	}
+
+	isOfficialDaily := strings.EqualFold(strings.TrimSpace(envOfficialDaily), "true")
+	if strings.TrimSpace(envOfficialDaily) == "" && reportDate == now.Format("2006-01-02") && runKind == "scheduled" {
+		isOfficialDaily = true
+	}
+
+	return ReportRunContext{
+		RunKind:         runKind,
+		TriggerSource:   triggerSource,
+		IsOfficialDaily: isOfficialDaily,
+		RuntimeAudit:    strings.TrimSpace(envRuntimeAudit),
+	}
+}
+
+func composeTrackedSummary(summary string, runContext ReportRunContext) string {
+	runtimeAudit := strings.TrimSpace(runContext.RuntimeAudit)
+	summary = strings.TrimSpace(summary)
+
+	if runtimeAudit == "" {
+		return summary
+	}
+	if summary == "" {
+		return runtimeAudit
+	}
+	return runtimeAudit + "\n" + summary
+}
+
 // ============ Data models ============

 const (
@@ -2869,11 +2921,12 @@ th {
 	return t.Execute(f, r)
 }

-func saveReportTrackingV3(db *sql.DB, r *ReportV3, mdPath string) error {
+func saveReportTrackingV3(db *sql.DB, r *ReportV3, mdPath string, runContext ReportRunContext) error {
 	summary := r.HeroSummary
 	if summary == "" {
 		summary = fmt.Sprintf("models=%d free=%d intl=%d domestic=%d", r.TotalModels, len(r.FreeModels), len(r.IntlTop5), len(r.DomesticTop10))
 	}
+	summary = composeTrackedSummary(summary, runContext)
 	tx, err := db.Begin()
 	if err != nil {
 		return err
@@ -2881,24 +2934,39 @@ func saveReportTrackingV3(db *sql.DB, r *ReportV3, mdPath string) error {
 	defer tx.Rollback()

 	if _, err := tx.Exec(`
-		INSERT INTO daily_report (report_date, status, model_count, new_models, free_models, summary_md, output_path, updated_at)
-		VALUES ($1, $2, $3, $4, $5, $6, $7, NOW())
+		INSERT INTO daily_report (report_date, status, model_count, new_models, free_models, summary_md, output_path, run_kind, trigger_source, is_official_daily, updated_at)
+		VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, NOW())
 		ON CONFLICT (report_date) DO UPDATE SET
 			status = EXCLUDED.status,
 			model_count = EXCLUDED.model_count,
 			free_models = EXCLUDED.free_models,
 			summary_md = EXCLUDED.summary_md,
 			output_path = EXCLUDED.output_path,
+			run_kind = CASE
+				WHEN EXCLUDED.is_official_daily THEN EXCLUDED.run_kind
+				WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.run_kind
+				ELSE daily_report.run_kind
+			END,
+			trigger_source = CASE
+				WHEN EXCLUDED.is_official_daily THEN EXCLUDED.trigger_source
+				WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.trigger_source
+				ELSE daily_report.trigger_source
+			END,
+			is_official_daily = CASE
+				WHEN EXCLUDED.is_official_daily THEN TRUE
+				WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.is_official_daily
+				ELSE daily_report.is_official_daily
+			END,
 			error_message = NULL,
 			updated_at = NOW()
-	`, r.Date, "generated", r.TotalModels, 0, len(r.FreeModels), summary, mdPath); err != nil {
+	`, r.Date, "generated", r.TotalModels, 0, len(r.FreeModels), summary, mdPath, runContext.RunKind, runContext.TriggerSource, runContext.IsOfficialDaily); err != nil {
 		return err
 	}

 	if _, err := tx.Exec(`
-		INSERT INTO report_runs (source, report_date, status, summary_md, output_path, error_message)
-		VALUES ($1, $2, $3, $4, $5, NULL)
-	`, "generate_daily_report", r.Date, "generated", summary, mdPath); err != nil {
+		INSERT INTO report_runs (source, report_date, status, summary_md, output_path, error_message, run_kind, trigger_source, is_official_daily)
+		VALUES ($1, $2, $3, $4, $5, NULL, $6, $7, $8)
+	`, "generate_daily_report", r.Date, "generated", summary, mdPath, runContext.RunKind, runContext.TriggerSource, runContext.IsOfficialDaily); err != nil {
 		return err
 	}

--- a/scripts/generate_daily_report_test.go
+++ b/scripts/generate_daily_report_test.go
@@ -227,6 +227,61 @@ func TestResolveReportDateRejectsInvalidDate(t *testing.T) {
 	}
 }

+func TestResolveReportRunContextDefaultsToManualCLI(t *testing.T) {
+	ctx := resolveReportRunContext("2026-05-14", time.Date(2026, 5, 14, 8, 0, 0, 0, time.FixedZone("CST", 8*3600)), "", "", "", "")
+	if ctx.RunKind != "manual" {
+		t.Fatalf("run kind = %q, want manual", ctx.RunKind)
+	}
+	if ctx.TriggerSource != "cli" {
+		t.Fatalf("trigger source = %q, want cli", ctx.TriggerSource)
+	}
+	if ctx.IsOfficialDaily {
+		t.Fatalf("manual run should not be official daily: %+v", ctx)
+	}
+}
+
+func TestResolveReportRunContextHonorsScheduledEnv(t *testing.T) {
+	ctx := resolveReportRunContext("2026-05-14", time.Date(2026, 5, 14, 8, 0, 0, 0, time.FixedZone("CST", 8*3600)), "scheduled", "cron", "true", "")
+	if ctx.RunKind != "scheduled" || ctx.TriggerSource != "cron" || !ctx.IsOfficialDaily {
+		t.Fatalf("unexpected scheduled context: %+v", ctx)
+	}
+}
+
+func TestResolveReportRunContextMarksHistoricalRebuildAsNonOfficial(t *testing.T) {
+	ctx := resolveReportRunContext("2025-08-07", time.Date(2026, 5, 14, 8, 0, 0, 0, time.FixedZone("CST", 8*3600)), "historical_rebuild", "rebuild_script", "false", "")
+	if ctx.RunKind != "historical_rebuild" {
+		t.Fatalf("run kind = %q, want historical_rebuild", ctx.RunKind)
+	}
+	if ctx.TriggerSource != "rebuild_script" {
+		t.Fatalf("trigger source = %q, want rebuild_script", ctx.TriggerSource)
+	}
+	if ctx.IsOfficialDaily {
+		t.Fatalf("historical rebuild should not be official daily: %+v", ctx)
+	}
+}
+
+func TestComposeTrackedSummaryPrependsRuntimeAudit(t *testing.T) {
+	summary := composeTrackedSummary(
+		"models=42 free=3 intl=5 domestic=10",
+		ReportRunContext{
+			RunKind:         "scheduled",
+			TriggerSource:   "cron",
+			IsOfficialDaily: true,
+			RuntimeAudit:    "runtime_audit stage_set=openrouter,multi_source,official_imports,daily_report selected_source_keys=openrouter,moonshot,deepseek,openai,zhipu,baidu,bytedance failed_source_keys=none",
+		},
+	)
+
+	if !strings.Contains(summary, "runtime_audit stage_set=openrouter,multi_source,official_imports,daily_report") {
+		t.Fatalf("expected runtime audit in tracked summary, got %q", summary)
+	}
+	if !strings.Contains(summary, "failed_source_keys=none") {
+		t.Fatalf("expected failed source keys in tracked summary, got %q", summary)
+	}
+	if !strings.Contains(summary, "models=42 free=3 intl=5 domestic=10") {
+		t.Fatalf("expected report summary to be preserved, got %q", summary)
+	}
+}
+
 func TestDecorateReportV1BuildsHotDaySummary(t *testing.T) {
 	report := sampleReportForV1()
 	report.ModelEvents = []ModelEvent{
--- a/scripts/rebuild_historical_report.sh
+++ b/scripts/rebuild_historical_report.sh
@@ -21,4 +21,8 @@ if [[ -f ".env" ]]; then
  source ".env"
 fi

-REPORT_DATE="$REPORT_DATE" go run -tags llm_script ./scripts/generate_daily_report.go "$@"
+REPORT_DATE="$REPORT_DATE" \
+REPORT_RUN_KIND="historical_rebuild" \
+REPORT_TRIGGER_SOURCE="rebuild_script" \
+REPORT_IS_OFFICIAL_DAILY="false" \
+go run -tags llm_script ./scripts/generate_daily_report.go "$@"
--- a/scripts/report_utils.sh
+++ b/scripts/report_utils.sh
@@ -55,7 +55,7 @@ archive_report_artifacts() {
 }

 track_report_state() {
-    local db_url report_date status model_count summary_md output_path error_message
+    local db_url report_date status model_count summary_md output_path error_message run_kind trigger_source is_official_daily
    db_url="$1"
    report_date="$2"
    status="$3"
@@ -63,6 +63,9 @@ track_report_state() {
    summary_md="${5:-}"
    output_path="${6:-}"
    error_message="${7:-}"
+    run_kind="${8:-manual}"
+    trigger_source="${9:-cli}"
+    is_official_daily="${10:-false}"

    psql "$db_url" \
        -v ON_ERROR_STOP=1 \
@@ -71,7 +74,10 @@ track_report_state() {
        --set=model_count="$model_count" \
        --set=summary_md="$summary_md" \
        --set=output_path="$output_path" \
-        --set=error_message="$error_message" <<'SQL'
+        --set=error_message="$error_message" \
+        --set=run_kind="$run_kind" \
+        --set=trigger_source="$trigger_source" \
+        --set=is_official_daily="$is_official_daily" <<'SQL'
 INSERT INTO daily_report (
    report_date,
    status,
@@ -79,6 +85,9 @@ INSERT INTO daily_report (
    summary_md,
    output_path,
    error_message,
+    run_kind,
+    trigger_source,
+    is_official_daily,
    created_at,
    updated_at
 )
@@ -89,6 +98,9 @@ VALUES (
    NULLIF(:'summary_md', ''),
    NULLIF(:'output_path', ''),
    NULLIF(:'error_message', ''),
+    NULLIF(:'run_kind', ''),
+    NULLIF(:'trigger_source', ''),
+    NULLIF(:'is_official_daily', '')::BOOLEAN,
    NOW(),
    NOW()
 )
@@ -98,6 +110,21 @@ ON CONFLICT (report_date) DO UPDATE SET
    summary_md = COALESCE(EXCLUDED.summary_md, daily_report.summary_md),
    output_path = COALESCE(EXCLUDED.output_path, daily_report.output_path),
    error_message = EXCLUDED.error_message,
+    run_kind = CASE
+        WHEN EXCLUDED.is_official_daily THEN EXCLUDED.run_kind
+        WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.run_kind
+        ELSE daily_report.run_kind
+    END,
+    trigger_source = CASE
+        WHEN EXCLUDED.is_official_daily THEN EXCLUDED.trigger_source
+        WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.trigger_source
+        ELSE daily_report.trigger_source
+    END,
+    is_official_daily = CASE
+        WHEN EXCLUDED.is_official_daily THEN TRUE
+        WHEN daily_report.trigger_source = 'legacy_backfill' THEN EXCLUDED.is_official_daily
+        ELSE daily_report.is_official_daily
+    END,
    updated_at = NOW();

 INSERT INTO report_runs (
@@ -106,7 +133,10 @@ INSERT INTO report_runs (
    status,
    summary_md,
    output_path,
-    error_message
+    error_message,
+    run_kind,
+    trigger_source,
+    is_official_daily
 )
 VALUES (
    'pipeline',
@@ -114,7 +144,10 @@ VALUES (
    :'status',
    NULLIF(:'summary_md', ''),
    NULLIF(:'output_path', ''),
-    NULLIF(:'error_message', '')
+    NULLIF(:'error_message', ''),
+    NULLIF(:'run_kind', ''),
+    NULLIF(:'trigger_source', ''),
+    NULLIF(:'is_official_daily', '')::BOOLEAN
 );
 SQL
 }
--- a/scripts/run_daily.sh
+++ b/scripts/run_daily.sh
@@ -5,27 +5,72 @@ set -euo pipefail

 PROJECT_DIR="/home/long/project/llm-intelligence"
 . "$PROJECT_DIR/scripts/report_utils.sh"
+if [[ -f "$PROJECT_DIR/.env.local" ]]; then
+    # shellcheck disable=SC1091
+    source "$PROJECT_DIR/.env.local"
+fi
+if [[ -f "$PROJECT_DIR/.env" ]]; then
+    # shellcheck disable=SC1091
+    source "$PROJECT_DIR/.env"
+fi
 DB_URL="${DATABASE_URL:-host=/var/run/postgresql dbname=llm_intelligence user=long sslmode=disable}"
 REPORT_DATE="$(report_date_value)"
 LOG_FILE="/tmp/llm_hub_daily_${REPORT_DATE}.log"
 FEISHU_WEBHOOK="${FEISHU_WEBHOOK:-}"
 MODEL_COUNT=""
+FETCH_OUT="${PROJECT_DIR}/models.json"
+FETCH_TOTAL="0"
+PIPELINE_STAGE_SET="openrouter,multi_source,official_imports,daily_report"
+PIPELINE_SOURCE_SET="openrouter,moonshot,deepseek,openai,zhipu,baidu,bytedance"
+PIPELINE_FAILED_SOURCE_SET="none"
+MULTI_SOURCE_AUDIT="multi_source_audit=unavailable"
+PIPELINE_AUDIT_SUMMARY=""

 # 日志函数
 log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
 }

+normalize_summary_file() {
+    local path="$1"
+    if [ ! -f "$path" ]; then
+        return
+    fi
+    tr '\n' ' ' < "$path" | sed 's/[[:space:]][[:space:]]*/ /g; s/^ //; s/ $//'
+}
+
+extract_failed_source_keys() {
+    local summary="$1"
+    printf '%s\n' "$summary" | sed -n 's/.*failed_source_keys=\([^ ]*\).*/\1/p'
+}
+
+merge_failed_source_keys() {
+    local keys="$1"
+    if [ -z "$keys" ] || [ "$keys" = "none" ]; then
+        return
+    fi
+    if [ "$PIPELINE_FAILED_SOURCE_SET" = "none" ]; then
+        PIPELINE_FAILED_SOURCE_SET="$keys"
+        return
+    fi
+    PIPELINE_FAILED_SOURCE_SET="${PIPELINE_FAILED_SOURCE_SET},${keys}"
+}
+
+refresh_pipeline_audit() {
+    PIPELINE_AUDIT_SUMMARY="runtime_audit stage_set=${PIPELINE_STAGE_SET} selected_source_keys=${PIPELINE_SOURCE_SET} failed_source_keys=${PIPELINE_FAILED_SOURCE_SET} openrouter_total=${FETCH_TOTAL:-0} ${MULTI_SOURCE_AUDIT}"
+}
+
 # 错误处理
 error_exit() {
    local output_path=""
    log "❌ 错误: $1"
+    refresh_pipeline_audit
    # 降级：复制昨日报告
    fallback_report
    if [ -f "$(report_markdown_path "$REPORT_DATE")" ]; then
        output_path="$(report_markdown_path "$REPORT_DATE")"
    fi
-    track_report_state "$DB_URL" "$REPORT_DATE" "failed" "${MODEL_COUNT:-}" "" "$output_path" "$1" >> "$LOG_FILE" 2>&1 || true
+    track_report_state "$DB_URL" "$REPORT_DATE" "failed" "${MODEL_COUNT:-}" "$PIPELINE_AUDIT_SUMMARY" "$output_path" "$1" "scheduled" "cron" "true" >> "$LOG_FILE" 2>&1 || true
    # 发送告警
    if [ -n "$FEISHU_WEBHOOK" ]; then
        send_alert "$1"
@@ -33,6 +78,8 @@ error_exit() {
    exit 1
 }

+refresh_pipeline_audit
+
 # 降级：复制昨日报告
 fallback_report() {
    local yesterday yesterday_md today_md yesterday_html today_html
@@ -77,11 +124,66 @@ cd "$PROJECT_DIR"

 # 1. 数据采集
 log "1️⃣ 数据采集..."
-if ! go run scripts/fetch_openrouter.go >> "$LOG_FILE" 2>&1; then
+if ! go run scripts/fetch_openrouter.go -strict-real -out "$FETCH_OUT" >> "$LOG_FILE" 2>&1; then
+    merge_failed_source_keys "openrouter"
    error_exit "数据采集失败"
 fi
+FETCH_TOTAL=$(python3 - <<'PY' "$FETCH_OUT"
+import json, sys
+path = sys.argv[1]
+with open(path, 'r', encoding='utf-8') as f:
+    data = json.load(f)
+print(int(data.get("total", 0)))
+PY
+) || FETCH_TOTAL="0"
+if [ "${FETCH_TOTAL:-0}" -lt 10 ]; then
+    merge_failed_source_keys "openrouter"
+    error_exit "本次采集结果异常: total=${FETCH_TOTAL:-0} < 10"
+fi
+refresh_pipeline_audit
 log "✅ 数据采集完成"

+# 1.5 多源补充同步
+log "1️⃣➕ 多源补充同步..."
+MULTI_SOURCE_OUTPUT="$(mktemp)"
+if ! go run scripts/fetch_multi_source.go --sources moonshot,deepseek,openai > "$MULTI_SOURCE_OUTPUT" 2>> "$LOG_FILE"; then
+    MULTI_SOURCE_SUMMARY="$(normalize_summary_file "$MULTI_SOURCE_OUTPUT")"
+    if [ -n "$MULTI_SOURCE_SUMMARY" ]; then
+        MULTI_SOURCE_AUDIT="multi_source_audit=${MULTI_SOURCE_SUMMARY}"
+        merge_failed_source_keys "$(extract_failed_source_keys "$MULTI_SOURCE_SUMMARY")"
+    else
+        MULTI_SOURCE_AUDIT="multi_source_audit=stage_failed"
+        merge_failed_source_keys "moonshot,deepseek,openai"
+    fi
+    cat "$MULTI_SOURCE_OUTPUT" >> "$LOG_FILE"
+    rm -f "$MULTI_SOURCE_OUTPUT"
+    error_exit "多源补充同步失败"
+fi
+MULTI_SOURCE_SUMMARY="$(normalize_summary_file "$MULTI_SOURCE_OUTPUT")"
+MULTI_SOURCE_AUDIT="multi_source_audit=${MULTI_SOURCE_SUMMARY:-none}"
+merge_failed_source_keys "$(extract_failed_source_keys "$MULTI_SOURCE_SUMMARY")"
+refresh_pipeline_audit
+cat "$MULTI_SOURCE_OUTPUT" >> "$LOG_FILE"
+rm -f "$MULTI_SOURCE_OUTPUT"
+if ! go run -tags llm_script scripts/import_zhipu_data.go >> "$LOG_FILE" 2>&1; then
+    merge_failed_source_keys "zhipu"
+    error_exit "智谱官方导入失败"
+fi
+if ! go run -tags llm_script scripts/export_official_seed_json.go >> "$LOG_FILE" 2>&1; then
+    merge_failed_source_keys "official_seed_export"
+    error_exit "官方种子导出失败"
+fi
+if ! go run -tags llm_script scripts/import_phase2_data.go >> "$LOG_FILE" 2>&1; then
+    merge_failed_source_keys "baidu"
+    error_exit "百度官方导入失败"
+fi
+if ! go run -tags llm_script scripts/import_bytedance_data.go >> "$LOG_FILE" 2>&1; then
+    merge_failed_source_keys "bytedance"
+    error_exit "字节官方导入失败"
+fi
+refresh_pipeline_audit
+log "✅ 多源补充同步完成"
+
 # 2. 数据质量检查
 log "2️⃣ 数据质量检查..."
 MODEL_COUNT=$(psql "$DB_URL" -t -c "SELECT COUNT(*) FROM models WHERE deleted_at IS NULL" 2>/dev/null | tr -d ' ')
@@ -93,7 +195,7 @@ log "✅ 数据质量检查通过 (模型数: ${MODEL_COUNT})"
 # 3. 生成日报
 log "3️⃣ 生成日报..."
 export DATABASE_URL="$DB_URL"
-if ! go run scripts/generate_daily_report.go >> "$LOG_FILE" 2>&1; then
+if ! REPORT_RUN_KIND="scheduled" REPORT_TRIGGER_SOURCE="cron" REPORT_IS_OFFICIAL_DAILY="true" REPORT_RUNTIME_AUDIT="$PIPELINE_AUDIT_SUMMARY" go run scripts/generate_daily_report.go >> "$LOG_FILE" 2>&1; then
    error_exit "日报生成失败"
 fi
 log "✅ 日报生成完成"
--- a/scripts/run_real_pipeline.sh
+++ b/scripts/run_real_pipeline.sh
@@ -25,30 +25,133 @@ if [[ -z "${OPENROUTER_API_KEY:-}" ]]; then
 fi

 REPORT_DATE="$(report_date_value)"
+FETCH_OUT="$ROOT_DIR/models.json"
+FETCH_TOTAL="0"
+PIPELINE_STAGE_SET="openrouter,multi_source,official_imports,daily_report"
+PIPELINE_SOURCE_SET="openrouter,moonshot,deepseek,openai,zhipu,baidu,bytedance"
+PIPELINE_FAILED_SOURCE_SET="none"
+MULTI_SOURCE_AUDIT="multi_source_audit=unavailable"
+PIPELINE_AUDIT_SUMMARY=""
+
+normalize_summary_file() {
+  local path="$1"
+  if [[ ! -f "$path" ]]; then
+    return
+  fi
+  tr '\n' ' ' < "$path" | sed 's/[[:space:]][[:space:]]*/ /g; s/^ //; s/ $//'
+}
+
+extract_failed_source_keys() {
+  local summary="$1"
+  printf '%s\n' "$summary" | sed -n 's/.*failed_source_keys=\([^ ]*\).*/\1/p'
+}
+
+merge_failed_source_keys() {
+  local keys="$1"
+  if [[ -z "$keys" || "$keys" == "none" ]]; then
+    return
+  fi
+  if [[ "$PIPELINE_FAILED_SOURCE_SET" == "none" ]]; then
+    PIPELINE_FAILED_SOURCE_SET="$keys"
+    return
+  fi
+  PIPELINE_FAILED_SOURCE_SET="${PIPELINE_FAILED_SOURCE_SET},${keys}"
+}
+
+refresh_pipeline_audit() {
+  PIPELINE_AUDIT_SUMMARY="runtime_audit stage_set=${PIPELINE_STAGE_SET} selected_source_keys=${PIPELINE_SOURCE_SET} failed_source_keys=${PIPELINE_FAILED_SOURCE_SET} openrouter_total=${FETCH_TOTAL:-0} ${MULTI_SOURCE_AUDIT}"
+}

 record_failure() {
  local error_message output_path
  error_message="$1"
  output_path=""
+  refresh_pipeline_audit

  if [[ -f "$(report_markdown_path "$REPORT_DATE")" ]]; then
    output_path="$(report_markdown_path "$REPORT_DATE")"
  fi

-  track_report_state "$DATABASE_URL" "$REPORT_DATE" "failed" "" "" "$output_path" "$error_message" >/dev/null 2>&1 || true
+  track_report_state "$DATABASE_URL" "$REPORT_DATE" "failed" "" "$PIPELINE_AUDIT_SUMMARY" "$output_path" "$error_message" "manual" "pipeline" "false" >/dev/null 2>&1 || true
 }

+refresh_pipeline_audit
+
 "$ROOT_DIR/scripts/apply_migration.sh"

 if ! go run "./scripts/fetch_openrouter.go" \
  -api-key "$OPENROUTER_API_KEY" \
  -db "$DATABASE_URL" \
-  -out "$ROOT_DIR/models.json"; then
+  -out "$FETCH_OUT" \
+  -strict-real; then
+  merge_failed_source_keys "openrouter"
  record_failure "真实采集失败"
  exit 1
 fi

-if ! go run "./scripts/generate_daily_report.go"; then
+FETCH_TOTAL=$(python3 - <<'PY' "$FETCH_OUT"
+import json, sys
+path = sys.argv[1]
+with open(path, 'r', encoding='utf-8') as f:
+    data = json.load(f)
+print(int(data.get("total", 0)))
+PY
+) || FETCH_TOTAL="0"
+if [[ "${FETCH_TOTAL:-0}" -lt 10 ]]; then
+  merge_failed_source_keys "openrouter"
+  record_failure "本次采集结果异常: total=${FETCH_TOTAL:-0} < 10"
+  exit 1
+fi
+refresh_pipeline_audit
+
+MULTI_SOURCE_OUTPUT="$(mktemp)"
+if ! go run "./scripts/fetch_multi_source.go" --sources moonshot,deepseek,openai > "$MULTI_SOURCE_OUTPUT"; then
+  MULTI_SOURCE_SUMMARY="$(normalize_summary_file "$MULTI_SOURCE_OUTPUT")"
+  if [[ -n "$MULTI_SOURCE_SUMMARY" ]]; then
+    MULTI_SOURCE_AUDIT="multi_source_audit=${MULTI_SOURCE_SUMMARY}"
+    merge_failed_source_keys "$(extract_failed_source_keys "$MULTI_SOURCE_SUMMARY")"
+  else
+    MULTI_SOURCE_AUDIT="multi_source_audit=stage_failed"
+    merge_failed_source_keys "moonshot,deepseek,openai"
+  fi
+  cat "$MULTI_SOURCE_OUTPUT"
+  rm -f "$MULTI_SOURCE_OUTPUT"
+  record_failure "多源补充同步失败"
+  exit 1
+fi
+MULTI_SOURCE_SUMMARY="$(normalize_summary_file "$MULTI_SOURCE_OUTPUT")"
+MULTI_SOURCE_AUDIT="multi_source_audit=${MULTI_SOURCE_SUMMARY:-none}"
+merge_failed_source_keys "$(extract_failed_source_keys "$MULTI_SOURCE_SUMMARY")"
+refresh_pipeline_audit
+cat "$MULTI_SOURCE_OUTPUT"
+rm -f "$MULTI_SOURCE_OUTPUT"
+
+if ! go run -tags llm_script "./scripts/import_zhipu_data.go"; then
+  merge_failed_source_keys "zhipu"
+  record_failure "智谱官方导入失败"
+  exit 1
+fi
+
+if ! go run -tags llm_script "./scripts/export_official_seed_json.go"; then
+  merge_failed_source_keys "official_seed_export"
+  record_failure "官方种子导出失败"
+  exit 1
+fi
+
+if ! go run -tags llm_script "./scripts/import_phase2_data.go"; then
+  merge_failed_source_keys "baidu"
+  record_failure "百度官方导入失败"
+  exit 1
+fi
+
+if ! go run -tags llm_script "./scripts/import_bytedance_data.go"; then
+  merge_failed_source_keys "bytedance"
+  record_failure "字节官方导入失败"
+  exit 1
+fi
+refresh_pipeline_audit
+
+if ! REPORT_RUN_KIND="manual" REPORT_TRIGGER_SOURCE="pipeline" REPORT_IS_OFFICIAL_DAILY="false" REPORT_RUNTIME_AUDIT="$PIPELINE_AUDIT_SUMMARY" go run "./scripts/generate_daily_report.go"; then
  record_failure "日报生成失败"
  exit 1
 fi
--- a/scripts/verify_phase3.sh
+++ b/scripts/verify_phase3.sh
@@ -19,6 +19,10 @@ check_executable "scripts/feishu_alert.sh" "飞书告警脚本可执行"
 check_shell "日报生成器可独立构建" "go build -o /dev/null ./scripts/generate_daily_report.go"
 check_shell "日报脚本包含降级逻辑" "grep -q 'fallback_report' scripts/run_daily.sh"
 check_shell "日报脚本包含飞书告警逻辑" "grep -q 'send_alert' scripts/run_daily.sh"
+check_shell "正式调度链启用严格真实采集" "grep -q -- '-strict-real' scripts/run_daily.sh && grep -q -- '-strict-real' scripts/run_real_pipeline.sh"
+check_shell "正式调度链校验本次采集结果数量" "grep -q '本次采集结果异常' scripts/run_daily.sh && grep -q '本次采集结果异常' scripts/run_real_pipeline.sh"
+check_shell "每日流水线已纳入多源补充同步" "grep -q 'fetch_multi_source.go --sources moonshot,deepseek,openai' scripts/run_daily.sh && grep -q 'import_zhipu_data.go' scripts/run_daily.sh && grep -q 'import_phase2_data.go' scripts/run_daily.sh && grep -q 'import_bytedance_data.go' scripts/run_daily.sh"
+check_shell "每日流水线会把来源级运行审计写入正式日报上下文" "grep -q 'REPORT_RUNTIME_AUDIT' scripts/run_daily.sh && grep -q 'selected_source_keys=' scripts/run_daily.sh && grep -q 'failed_source_keys=' scripts/run_daily.sh"
 check_shell "今日日报 Markdown 主产物存在且包含数据质量摘要" "test -f ${TODAY_MARKDOWN_PATH} && grep -q '数据质量摘要' ${TODAY_MARKDOWN_PATH}"
 check_shell "今日日报 HTML 主产物存在" "test -f ${TODAY_HTML_PATH}"
 check_shell "今日日报归档副本存在（Markdown + HTML）" "test -f ${TODAY_ARCHIVE_MARKDOWN_PATH} && test -f ${TODAY_ARCHIVE_HTML_PATH}"
--- a/scripts/verify_phase5.sh
+++ b/scripts/verify_phase5.sh
@@ -16,6 +16,7 @@ check_executable "scripts/backup.sh" "数据库备份脚本可执行"
 check_file "healthcheck.sh" "健康检查脚本存在"
 check_file "scripts/restore.sh" "数据库恢复脚本存在"
 check_shell "Makefile 暴露真实流水线与总门禁入口" "grep -q '^run-real-pipeline:' Makefile && grep -q '^verify-phase1:' Makefile && grep -q '^verify-phase6:' Makefile && grep -q '^verify-pre-phase6:' Makefile"
+check_shell "真实流水线包含多源调度与来源级运行审计" "grep -Eq 'fetch_multi_source\\.go\"? --sources moonshot,deepseek,openai' scripts/run_real_pipeline.sh && grep -q 'REPORT_RUNTIME_AUDIT' scripts/run_real_pipeline.sh && grep -q 'failed_source_keys=' scripts/run_real_pipeline.sh"
 check_shell "部署文档覆盖 Docker、前端启动与 cron 配置" "grep -q 'docker build' DEPLOYMENT.md && grep -q 'npm run dev' DEPLOYMENT.md && grep -q 'crontab -e' DEPLOYMENT.md"
 check_shell "健康检查脚本覆盖数据库与日报可用性" "grep -q 'psql' healthcheck.sh && grep -q 'reports/daily/daily_report_' healthcheck.sh"
 check_shell "备份恢复脚本具备 PostgreSQL 入口" "grep -Eq 'pg_dump|psql' scripts/backup.sh && grep -Eq 'psql|pg_restore' scripts/restore.sh"