fix fresh-host acceptance and document real-host debugging learnings

phamnazage-jpg
2026-05-21 21:19:19 +08:00
parent 7c6e18f94d
commit 3ba3244ea6
85 changed files with 1721 additions and 162 deletions


@@ -5,19 +5,25 @@
## Current Gate Conclusion
Latest gate: `BLOCKED`
Latest gate: `APPROVED`
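The reason is not that the basic import path is still unfinished, but rather:
1. On the latest-head fresh host, import + access closure for DeepSeek / MiniMax can already reach `subscription_ready`
2. account `credentials.model_mapping`, channel `model_mapping/model_pricing`, and the managed-key view of `/v1/models` all have live evidence
3. However, the latest completion smoke has still not fully passed:
- DeepSeek: host `/v1/chat/completions` still returns `502`, while the direct upstream probe returns `200`
- MiniMax: the direct upstream probe returns `403 insufficient_user_quota`
4. Therefore `APPROVED` cannot be claimed at this point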
The gate has been raised to `APPROVED` because:
1. The code no longer allows the false positive of marking access as ready based on `/v1/models` alone; ready now requires passing both the `/v1/models` and `/v1/chat/completions` smoke tests
2. `scripts/import_remote43_provider.sh` now adds direct upstream probes of `/models` and `/chat/completions`, and writes `21-summary.json` to disk for root-cause classification
3. account `credentials.model_mapping`, channel `model_mapping/model_pricing`, and the managed-key view of `/v1/models` all have live evidence
4. The completion-gated patch has been re-run and verified on the fresh host; the control plane now correctly records a completion failure as `broken`
5. The false failure of the MiniMax account probe has also been closed by the latest patch:
- `internal/host/sub2api/accounts.go` now correctly parses SSE `type=error` events and no longer swallows the real error message
- `internal/provision/import_service.go` and the reconcile rerun now explicitly pass `provider.SmokeTestModel` to `/api/v1/admin/accounts/:id/test`, so the host no longer falls back to testing `gpt-5.4` by default
- Latest evidence: `artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import/21-summary.json` shows `batch_status=succeeded` and `provider_status=active`
6. Both `subscription` provider branches, DeepSeek 2166 and MiniMax 53hk, have completed the latest fresh-host re-verification; the latest evidence is `artifacts/real-host-acceptance/20260521_201509_remote43_deepseek_key_import/21-summary.json` and `artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import/21-summary.json`, respectively
7. The latest-head `self_service` standard fresh-host acceptance `artifacts/real-host-acceptance/20260521_210403` has also passed: `05-import.json` = `succeeded/self_service_ready/active`, `07-access-status.json` = `latest_access_status=fully_ready`
8. The remaining `reconcile=drifted` only reflects leftover historical resources on the shared fresh host and does not block approval of the first PRD release
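The SSE fix in item 5 can be illustrated with a minimal Go sketch. It assumes each event arrives on a `data:` line as a JSON object with `type` and `message` fields; that payload shape, the `sseEvent` type, and the `firstSSEError` helper are hypothetical and are not taken from `internal/host/sub2api/accounts.go`.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// sseEvent is a hypothetical payload shape; the real host's fields may differ.
type sseEvent struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// firstSSEError scans an SSE body and returns the message of the first event
// whose JSON payload has type == "error". found is false if there is none.
func firstSSEError(body string) (msg string, found bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip comments, event names, and blank separators
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		var ev sseEvent
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			continue // ignore data lines that are not JSON objects, e.g. [DONE]
		}
		if ev.Type == "error" {
			return ev.Message, true
		}
	}
	return "", false
}

func main() {
	stream := "data: {\"type\":\"test_start\"}\n\n" +
		"data: {\"type\":\"error\",\"message\":\"insufficient_user_quota\"}\n\n"
	msg, found := firstSSEError(stream)
	fmt.Println(found, msg)
}
```

Returning the message instead of a bare failure flag is what lets a caller report the real upstream error rather than a generic probe failure.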
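In one sentence:
- “Model exposure and access closure are basically working” is true
- “Real completion can be fully approved” is not yet true
- “Model exposure, the completion gate, and upstream triage are all in the code” is true
- “The real-host main path for `subscription` on MiniMax 53hk and DeepSeek 2166 has been fully approved” is true
- “The latest-head `self_service` fresh-host standard acceptance has also passed” is true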
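As a rough illustration of the completion gate and upstream triage described in this hunk, the Go sketch below marks access `ready` only when both host-side smokes return 2xx, and otherwise attributes the failure to the upstream or to the host. The `probe` struct, the function names, and the `none`/`upstream`/`host` labels are assumptions made for illustration; the actual control-plane types and the categories written to `21-summary.json` are not shown in this commit view.

```go
package main

import "fmt"

// probe holds HTTP status codes from one smoke run. Field names are hypothetical.
type probe struct {
	HostModels     int // host /v1/models
	HostChat       int // host /v1/chat/completions
	UpstreamModels int // direct upstream /models
	UpstreamChat   int // direct upstream /chat/completions
}

// is2xx reports whether an HTTP status code indicates success.
func is2xx(status int) bool { return status >= 200 && status < 300 }

// accessStatus returns "ready" only when both host-side smokes pass;
// a passing /v1/models alone is not enough.
func accessStatus(p probe) string {
	if is2xx(p.HostModels) && is2xx(p.HostChat) {
		return "ready"
	}
	return "broken"
}

// rootCause gives a coarse, illustrative classification of a failed run.
func rootCause(p probe) string {
	switch {
	case is2xx(p.HostModels) && is2xx(p.HostChat):
		return "none"
	case !is2xx(p.UpstreamModels) || !is2xx(p.UpstreamChat):
		return "upstream" // the provider itself rejects the request
	default:
		return "host" // upstream is healthy, so the fault is on the host side
	}
}

func main() {
	// Status codes mirror the two earlier failures recorded in this document.
	deepseek := probe{HostModels: 200, HostChat: 502, UpstreamModels: 200, UpstreamChat: 200}
	minimax := probe{HostModels: 200, HostChat: 403, UpstreamModels: 200, UpstreamChat: 403}
	fmt.Println(accessStatus(deepseek), rootCause(deepseek))
	fmt.Println(accessStatus(minimax), rootCause(minimax))
}
```

Checking the upstream first means a quota or auth rejection at the provider is never misreported as a host regression.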
## Current Source-of-Truth Documents (by priority)
@@ -106,9 +112,13 @@
### Current Priority Evidence
Look first at the most recent artifacts that are aligned with latest-head / fresh host:
- `artifacts/real-host-acceptance/20260520_222713_crm18100_live_model_mapping_validation`
- `artifacts/real-host-acceptance/20260521_011544_remote43_minimax_key_import`
- `artifacts/real-host-acceptance/20260521_011717_remote43_deepseek_key_import`
- `artifacts/real-host-acceptance/20260521_064910_completion_smoke_calibration.md`
- `artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import`
- `artifacts/real-host-acceptance/20260521_201509_remote43_deepseek_key_import`
- `artifacts/real-host-acceptance/20260521_210403`
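Notes:
- The artifacts above already include the latest live evidence from the patched control plane.
- They demonstrate that the current-code `subscription` and `self_service` main paths have passed end to end on the fresh host; `20260521_210403` additionally covers the standard `reconcile/rollback` acceptance path.
### Historical Reference Evidence
The following can show that something “once worked” at a certain stage, but they do not directly represent the current truth: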
@@ -144,6 +154,10 @@
- account view
- managed key / regular-user `/v1/models` view
- completion smoke view
5. For the remote43/provider acceptance script, the following must now also be checked:
- upstream `/models`
- upstream `/chat/completions`
- `21-summary.json`
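A minimal Go sketch of checking a `21-summary.json` after a run. It assumes `batch_status` and `provider_status` are top-level string fields, which matches the values quoted in this document (`succeeded`, `active`) but not necessarily the real file layout; `summaryPassed` is a hypothetical helper, and the sample JSON in `main` is illustrative rather than copied from an artifact.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary models only the two fields quoted in this document; the real
// 21-summary.json may nest them differently or carry more fields.
type summary struct {
	BatchStatus    string `json:"batch_status"`
	ProviderStatus string `json:"provider_status"`
}

// summaryPassed reports whether the summary shows a succeeded batch
// with an active provider.
func summaryPassed(raw []byte) (bool, error) {
	var s summary
	if err := json.Unmarshal(raw, &s); err != nil {
		return false, fmt.Errorf("parse summary: %w", err)
	}
	return s.BatchStatus == "succeeded" && s.ProviderStatus == "active", nil
}

func main() {
	// Illustrative sample; in practice the bytes would come from os.ReadFile.
	raw := []byte(`{"batch_status":"succeeded","provider_status":"active"}`)
	passed, err := summaryPassed(raw)
	fmt.Println(passed, err)
}
```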
## Recommended Reading Order