# SPEC: Batch Auto-Import by URL + Key (v2)

Date: 2026-05-21

Architecture: `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_ARCHITECTURE.md`

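## 1. Objective

The goal of V2 is not "yet another import command" but a control-plane capability that is **stable, recoverable, and traceable**:

1. **Upstream discovery**: discover models automatically from `(base_url, api_key)` instead of trusting manual input by default
2. **Model correction**: normalize names, match aliases, and recommend the correct model name automatically
3. **Compatibility profile**: record the compatibility capabilities of every upstream and every model so the same pitfalls are not hit twice
4. **Host evolution**: create or update the channel, account, and provider binding automatically
5. **Async confirmation**: absorb the host's asynchronous probe and the warm-up window of initial `403/503` responses
6. **Closed-loop validation**: treat the real `/v1/chat/completions` result from the host gateway as the final availability verdict
7. **Visible results**: provide a run list, run details, and item details instead of relying only on logs and artifacts
8. **Reuse on re-import**: when a provider was already imported successfully and its models are covered, adding it again should reuse it automatically instead of creating a duplicate
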
## 2. Scope

V2 adds a **URL + key auto-import** path alongside the existing v1 pack-based path.

### In Scope

- Batch input of `(base_url, api_key, requested_models?)`
- Support for both `subscription` and `self_service`
- Persistence of runtime state
- Background asynchronous confirmation with bounded retries
- A minimal result API and result page
- Model correction and capability profile persistence

### Out of Scope

- Automatic load balancing across multiple keys
- Direct connections to the host database
- Automatic price discovery / automatic price adjustment
- Real-time WebSocket push
- A complex workbench frontend

## 3. Canonical Contract

From this point on, V2 recognizes a single canonical contract. All later documents, APIs, pages, and the state store must follow its naming.

### 3.1 ID rules

- `run_id`: ID of one batch import task, string
- `item_id`: ID of a single import record within a run, string
- `provider_id`: `{normalized_host}-{url_hash_last8}`

`provider_id` generation rules:

1. The hash is computed over the full normalized `base_url`, not the host alone
2. `normalized_host` is there for readability
3. `url_hash_last8` distinguishes different paths on the same host

Examples:

- `https://api.deepseek.com/v1` → `api-deepseek-9f31c2ab`
- `https://api.deepseek.com/proxy/v1` → `api-deepseek-4a2d88f1`

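
The generation rules above can be sketched as follows. This is a minimal illustration, not the canonical algorithm: the spec does not name the hash function or the exact normalization, so SHA-256, the lowercase/trailing-slash normalization, and the "drop the last DNS label" host shortening are all assumptions, and the hashes it prints will not match the example IDs above. `NormalizeBaseURL` and `ProviderID` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
	"strings"
)

// NormalizeBaseURL lowercases scheme and host, trims trailing slashes from
// the path, and drops query and fragment. These rules are assumptions.
func NormalizeBaseURL(raw string) (*url.URL, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return nil, err
	}
	if u.Host == "" {
		return nil, fmt.Errorf("base_url %q has no host", raw)
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	u.Path = strings.TrimRight(u.Path, "/")
	u.RawQuery, u.Fragment = "", ""
	return u, nil
}

// ProviderID returns "{normalized_host}-{url_hash_last8}".
func ProviderID(baseURL string) (string, error) {
	u, err := NormalizeBaseURL(baseURL)
	if err != nil {
		return "", err
	}
	// Assumption: drop the last DNS label and join the rest with "-",
	// so "api.deepseek.com" reads as "api-deepseek" like the spec examples.
	labels := strings.Split(u.Hostname(), ".")
	if len(labels) > 1 {
		labels = labels[:len(labels)-1]
	}
	host := strings.Join(labels, "-")

	// Assumption: SHA-256 over the full normalized URL, last 8 hex characters.
	sum := sha256.Sum256([]byte(u.String()))
	h := hex.EncodeToString(sum[:])
	return host + "-" + h[len(h)-8:], nil
}

func main() {
	for _, raw := range []string{
		"https://api.deepseek.com/v1",
		"https://api.deepseek.com/proxy/v1",
	} {
		id, err := ProviderID(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(raw, "->", id)
	}
}
```

Because the hash covers the whole normalized URL, the two example URLs share the readable prefix but end in different suffixes.
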
### 3.2 Run-level state

`run.state` is fixed to:

- `running`
- `completed`
- `completed_with_warnings`
- `failed`
- `cancelled`

Notes:

- `completed_with_warnings` is the overall run-level state
- Pages may render it as a yellow `warning` badge
- The API and the state store always write the full enum value `completed_with_warnings`

### 3.3 Item-level state

`item.current_stage` is fixed to:

- `probe`
- `provision`
- `confirm`
- `validate`
- `done`

`item.confirmation_status` is fixed to:

- `pending`
- `confirmed`
- `advisory`
- `failed`

`item.access_status` is fixed to:

- `unknown`
- `active`
- `degraded`
- `broken`

Constraints:

- `confirmation_status` only describes whether the host's asynchronous window has been confirmed stable
- `access_status` only describes the final, real availability through the gateway
- The `Validation Engine` is the sole writer of `access_status`

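
A small sketch of how the enums in 3.2 and 3.3 could be checked before anything is written to the state store. The enum values are copied from this section; the `ItemState` type, `Validate`, and `BadgeLabel` are hypothetical helpers, not part of the contract.

```go
package main

import "fmt"

// Enum values copied from sections 3.2 and 3.3.
var (
	runStates            = []string{"running", "completed", "completed_with_warnings", "failed", "cancelled"}
	itemStages           = []string{"probe", "provision", "confirm", "validate", "done"}
	confirmationStatuses = []string{"pending", "confirmed", "advisory", "failed"}
	accessStatuses       = []string{"unknown", "active", "degraded", "broken"}
)

func oneOf(set []string, v string) bool {
	for _, s := range set {
		if s == v {
			return true
		}
	}
	return false
}

// ItemState is a hypothetical holder for the three item-level fields.
type ItemState struct {
	CurrentStage       string
	ConfirmationStatus string
	AccessStatus       string
}

// Validate rejects any value outside the canonical enums.
func (s ItemState) Validate() error {
	switch {
	case !oneOf(itemStages, s.CurrentStage):
		return fmt.Errorf("invalid current_stage %q", s.CurrentStage)
	case !oneOf(confirmationStatuses, s.ConfirmationStatus):
		return fmt.Errorf("invalid confirmation_status %q", s.ConfirmationStatus)
	case !oneOf(accessStatuses, s.AccessStatus):
		return fmt.Errorf("invalid access_status %q", s.AccessStatus)
	}
	return nil
}

// ValidRunState reports whether v is a canonical run.state value.
func ValidRunState(v string) bool { return oneOf(runStates, v) }

// BadgeLabel maps the stored run state to a page label. Only the page
// shortens the value; API and state store keep the full enum.
func BadgeLabel(runState string) string {
	if runState == "completed_with_warnings" {
		return "warning"
	}
	return runState
}

func main() {
	good := ItemState{CurrentStage: "confirm", ConfirmationStatus: "pending", AccessStatus: "unknown"}
	bad := ItemState{CurrentStage: "confirming", ConfirmationStatus: "pending", AccessStatus: "unknown"}
	fmt.Println(good.Validate())
	fmt.Println(bad.Validate())
	fmt.Println(BadgeLabel("completed_with_warnings"))
}
```
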
### 3.4 Legacy compatibility rules

The V1 tables `import_batches` / `import_batch_items` / `managed_resources` are retained, but in V2:

- They serve only as legacy execution evidence or as a source of resource links
- They are no longer the primary data source of the result page
- V2 result pages and APIs read only `import_runs` / `import_run_items` / `import_run_item_events`

## 4. Request / Result Contract

### 4.1 Batch Import Request

```text
BatchImportRunRequest
- host_id: string
- mode: "strict" | "partial"
- access_mode: "subscription" | "self_service"
- confirm_wait_timeout_sec: int   # optional wait time for CLI/HTTP
- entries: []BatchImportEntry
- subscription_users: []string    # required when access_mode=subscription
- subscription_days: int          # required when access_mode=subscription
- probe_api_key: string           # required when access_mode=self_service

BatchImportEntry
- base_url: string
- api_key: string
- requested_models: []string      # optional, treated only as a hint
```

### 4.2 Access mode required fields

`subscription`:

- Required: `subscription_users`
- Required: `subscription_days`
- A request that sets only `access_mode=subscription` without subscription targets is rejected

`self_service`:

- Required: `probe_api_key`
- `probe_api_key` is used for the final gateway access validation

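
The required-field rules above could be enforced at the request boundary roughly like this. The struct mirrors `BatchImportRunRequest` from 4.1; `ValidateAccessMode` is a hypothetical name, and treating `subscription_days <= 0` as missing is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

type BatchImportEntry struct {
	BaseURL         string
	APIKey          string
	RequestedModels []string // optional hint only
}

type BatchImportRunRequest struct {
	HostID                string
	Mode                  string // "strict" | "partial"
	AccessMode            string // "subscription" | "self_service"
	ConfirmWaitTimeoutSec int
	Entries               []BatchImportEntry
	SubscriptionUsers     []string
	SubscriptionDays      int
	ProbeAPIKey           string
}

// ValidateAccessMode applies the per-mode required-field rules.
func ValidateAccessMode(r BatchImportRunRequest) error {
	switch r.AccessMode {
	case "subscription":
		if len(r.SubscriptionUsers) == 0 {
			return errors.New("subscription_users is required when access_mode=subscription")
		}
		// Assumption: a non-positive day count counts as "not provided".
		if r.SubscriptionDays <= 0 {
			return errors.New("subscription_days is required when access_mode=subscription")
		}
	case "self_service":
		if r.ProbeAPIKey == "" {
			return errors.New("probe_api_key is required when access_mode=self_service")
		}
	default:
		return fmt.Errorf("unknown access_mode %q", r.AccessMode)
	}
	return nil
}

func main() {
	req := BatchImportRunRequest{HostID: "h1", Mode: "partial", AccessMode: "subscription"}
	fmt.Println(ValidateAccessMode(req)) // rejected: no subscription targets

	req.SubscriptionUsers = []string{"u1", "u2"}
	req.SubscriptionDays = 30
	fmt.Println(ValidateAccessMode(req)) // <nil>
}
```
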
### 4.3 Batch Import Result

```text
BatchImportRunResult
- run_id: string
- state: string
- total_items: int
- active_items: int
- degraded_items: int
- broken_items: int
- warning_items: int
- result_page: string
```

### 4.4 Item Projection

```text
BatchImportRunItemView
- item_id: string
- base_url: string
- provider_id: string
- api_key_fingerprint: string
- requested_models: []string
- raw_models: []string
- normalized_models: []string
- canonical_model_families: []string
- resolved_smoke_model: string | null
- recommended_models: []string
- current_stage: string
- confirmation_status: string
- access_status: string
- matched_account_state: string
- account_resolution: string
- retry_count: int
- last_retry_at: string | null
- advisory_messages: []string
- last_error_stage: string | null
- last_error: string | null
- channel_id: int64 | null
- account_id: int64 | null
- provision_reused: bool
- reused_from_provider_id: string | null
- reused_from_account_id: int64 | null
- capability_profile: object
```

## 5. Core Pipeline

### 5.1 Five-stage pipeline (plus Stage 0 setup)

```text
Stage 0: Run Setup
  create import_run + import_run_items
  persist operator input

Stage 1: Probe
  /v1/models
  capability probe
  completion smoke
  normalize aliases

Stage 2: Provision
  find/create channel
  patch model_mapping + model_pricing + restrict_models + billing_model_source
  create/update account
  persist managed resource link

Stage 3: Confirm
  background confirmer absorbs async probe race / warmup window
  writes confirmation_status

Stage 4: Validate
  host gateway real /v1/chat/completions
  writes final access_status

Stage 5: Project
  update run summary
  serve result API / pages
```

### 5.2 Ownership boundaries

- The `Probe Layer` discovers and classifies; it does not decide the final `access_status`
- The `Provision Adapter` creates and updates host resources
- The `Confirmation Engine` absorbs transient `403/503` responses into `pending/advisory/failed`
- The `Validation Engine` owns the final `access_status`
- The `Result Projection` turns the state store into page and API views

## 6. Capability Profile

### 6.1 Why two layers

In real deployments, compatibility cannot be expressed as "one overall profile per key". It must be split into:

1. **transport profile**: whether this upstream supports `/models`, `/chat/completions`, `/responses`, and `/messages`
2. **model profiles**: whether a specific model under this upstream works with stream, tools, and reasoning fields

### 6.1.1 Why a canonical model family is also needed

Different relays may name the same model slightly differently even though the API and capability set are essentially identical, for example:

- `kimi 2.6`
- `kimi-2.6`
- `kimi-k2.6`
- `Kimi-K2.6`

V2 must not treat these names as entirely different models. It folds them into one `canonical_model_family`, which is used for:

- Deciding reuse on re-import
- Deciding model coverage
- Deciding alias patches
- Producing recommended model names

### 6.2 Canonical schema

```json
{
  "transport_profile": {
    "supports_openai_models": true,
    "supports_openai_chat_completions": true,
    "supports_openai_responses": false,
    "supports_anthropic_messages": false,
    "auth_style": "bearer",
    "model_id_style": "vendor_prefixed",
    "known_advisories": [
      "responses_unsupported_but_chat_ok",
      "initial_probe_race_expected"
    ]
  },
  "model_profiles": [
    {
      "raw_model_id": "deepseek-ai/DeepSeek-V4-Pro",
      "normalized_model_id": "deepseek-v4-pro",
      "canonical_model_family": "deepseek-v4-pro",
      "supports_stream": true,
      "supports_tools": "unknown",
      "supports_reasoning_fields": "unknown",
      "smoke_chat_ok": true
    }
  ]
}
```

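
In the schema above, per-model capability flags mix JSON booleans (`true`) with the string `"unknown"`, which a plain Go `bool` cannot decode. One possible way to handle this is a tri-state type with a custom decoder, sketched below. `TriState`, `ParseCapabilityProfile`, and `ShouldSkipResponses` are hypothetical names, and the sample document is a trimmed copy of the schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TriState holds a capability that is "true", "false", or "unknown".
type TriState string

const (
	TriTrue    TriState = "true"
	TriFalse   TriState = "false"
	TriUnknown TriState = "unknown"
)

// UnmarshalJSON accepts the JSON booleans true/false and the string "unknown".
func (t *TriState) UnmarshalJSON(b []byte) error {
	switch string(b) {
	case "true":
		*t = TriTrue
	case "false":
		*t = TriFalse
	case "\"unknown\"":
		*t = TriUnknown
	default:
		return fmt.Errorf("unsupported tri-state value %s", b)
	}
	return nil
}

type TransportProfile struct {
	SupportsOpenAIModels          bool     `json:"supports_openai_models"`
	SupportsOpenAIChatCompletions bool     `json:"supports_openai_chat_completions"`
	SupportsOpenAIResponses       bool     `json:"supports_openai_responses"`
	SupportsAnthropicMessages     bool     `json:"supports_anthropic_messages"`
	AuthStyle                     string   `json:"auth_style"`
	ModelIDStyle                  string   `json:"model_id_style"`
	KnownAdvisories               []string `json:"known_advisories"`
}

type ModelProfile struct {
	RawModelID              string   `json:"raw_model_id"`
	NormalizedModelID       string   `json:"normalized_model_id"`
	CanonicalModelFamily    string   `json:"canonical_model_family"`
	SupportsStream          TriState `json:"supports_stream"`
	SupportsTools           TriState `json:"supports_tools"`
	SupportsReasoningFields TriState `json:"supports_reasoning_fields"`
	SmokeChatOK             bool     `json:"smoke_chat_ok"`
}

type CapabilityProfile struct {
	TransportProfile TransportProfile `json:"transport_profile"`
	ModelProfiles    []ModelProfile   `json:"model_profiles"`
}

// ParseCapabilityProfile decodes a capability_profile JSON document.
func ParseCapabilityProfile(data []byte) (CapabilityProfile, error) {
	var p CapabilityProfile
	err := json.Unmarshal(data, &p)
	return p, err
}

// ShouldSkipResponses is one example consumer of the transport profile:
// skip /responses when the upstream is known not to support it.
func ShouldSkipResponses(p CapabilityProfile) bool {
	return !p.TransportProfile.SupportsOpenAIResponses
}

// sampleProfile is a trimmed copy of the schema example.
const sampleProfile = `{
  "transport_profile": {"supports_openai_chat_completions": true, "supports_openai_responses": false},
  "model_profiles": [{
    "raw_model_id": "deepseek-ai/DeepSeek-V4-Pro",
    "supports_stream": true,
    "supports_tools": "unknown"
  }]
}`

func main() {
	p, err := ParseCapabilityProfile([]byte(sampleProfile))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	m := p.ModelProfiles[0]
	fmt.Println(m.RawModelID, m.SupportsStream, m.SupportsTools, ShouldSkipResponses(p))
}
```
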
### 6.3 Usage

- Decide whether to skip `/responses`
- Decide whether to go straight to raw `/chat/completions`
- Decide the warning text
- Decide the recommended smoke model
- Support later fast matching of "which model is reliable under which compatibility layer"

### 6.4 Canonical model family rules

V2 processes model names in three layers:

1. `raw_model_id`
2. `normalized_model_id`
3. `canonical_model_family`

Examples:

| raw_model_id | normalized_model_id | canonical_model_family |
|---|---|---|
| `kimi 2.6` | `kimi-2.6` | `kimi-2.6` |
| `kimi-k2.6` | `kimi-k2.6` | `kimi-2.6` |
| `Kimi-K2.6` | `kimi-k2.6` | `kimi-2.6` |
| `deepseek-ai/DeepSeek-V4-Pro` | `deepseek-v4-pro` | `deepseek-v4-pro` |

Constraints:

- `canonical_model_family` identifies, across relays, whether two names belong to the same model family
- `normalized_model_id` is what the control plane and the channel persist
- `raw_model_id` preserves the upstream's original routing name

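
The three layers can be illustrated with a small sketch that reproduces the example table. The normalization steps (strip vendor prefix, lowercase, collapse whitespace to `-`) are inferred from the table rows, and `familyAliases` is a hypothetical lookup table; the spec does not define how families are actually resolved.

```go
package main

import (
	"fmt"
	"strings"
)

// familyAliases is a hypothetical alias table from normalized_model_id to
// canonical_model_family. A real table would be maintained as data.
var familyAliases = map[string]string{
	"kimi-k2.6": "kimi-2.6",
}

// NormalizeModelID derives normalized_model_id from raw_model_id:
// drop a vendor prefix such as "deepseek-ai/", lowercase, and replace
// runs of whitespace with "-". These steps are inferred from the table.
func NormalizeModelID(raw string) string {
	id := strings.TrimSpace(raw)
	if i := strings.LastIndex(id, "/"); i >= 0 {
		id = id[i+1:]
	}
	id = strings.ToLower(id)
	return strings.Join(strings.Fields(id), "-")
}

// CanonicalFamily maps a normalized ID to its family, defaulting to the
// normalized ID itself when no alias is registered.
func CanonicalFamily(normalized string) string {
	if family, ok := familyAliases[normalized]; ok {
		return family
	}
	return normalized
}

func main() {
	for _, raw := range []string{"kimi 2.6", "kimi-k2.6", "Kimi-K2.6", "deepseek-ai/DeepSeek-V4-Pro"} {
		n := NormalizeModelID(raw)
		fmt.Printf("%q -> %q -> %q\n", raw, n, CanonicalFamily(n))
	}
}
```
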
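## 7. Existing Provider Reuse / Idempotent Re-import

### 7.1 Goal

If a provider has already been imported successfully and its existing model families cover the models requested this time, adding it again should:

- Not create a duplicate channel/account/provider
- Reuse the existing successful chain directly
- Patch only new aliases / new model mappings when necessary
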
### 7.2 Pre-check order

Before Stage 2, every item must run these checks in order:

1. Look up an existing provider by `host_id + provider_id`
2. Look up an existing account by `host_id + base_url + api_key_fingerprint`
3. Compare:
   - `canonical_model_families`
   - `normalized_models`
   - The existing `access_status`
   - The existing account health status

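### 7.3 Decision table

| Scenario | Behavior |
|---|---|
| Provider exists, `access_status=active`, and the existing `canonical_model_families` cover this request | Reuse directly; skip provisioning |
| An existing account is matched and its state is `active` | Mark as a duplicate, already-enabled account; reuse directly and report `duplicate_active_account` |
| An existing account is matched, its state is `disabled` or `deprecated`, but the key is still healthy | Take the `reactivated` path: quickly re-enable the existing account without creating a new one |
| Provider exists and the account is healthy, but some aliases / mappings are missing | Patch only; do not rebuild |
| Provider exists, but the key is invalid or `access_status=broken` | Do not reuse; enter repair/replace |
| Same host and same URL, but a different access_mode | Do not reuse the access result directly; confirm separately per mode |
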
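
The decision table could be encoded as a single function, as in the sketch below. The table does not state which row wins when several apply, so the order of the `switch` cases is an assumption, missing family coverage is used as a stand-in for "some aliases / mappings are missing", and the returned action labels (`provision`, `patch_only`, and so on) are hypothetical names rather than canonical enum values.

```go
package main

import "fmt"

// Existing summarizes what the pre-checks found for one item.
type Existing struct {
	ProviderFound  bool
	AccountFound   bool
	AccountState   string   // "active", "disabled", "deprecated", "broken"
	AccessStatus   string   // existing access_status
	KeyHealthy     bool
	SameAccessMode bool
	Families       []string // existing canonical_model_families
}

// covers reports whether every requested family is already present.
func covers(existing, requested []string) bool {
	have := make(map[string]bool, len(existing))
	for _, f := range existing {
		have[f] = true
	}
	for _, f := range requested {
		if !have[f] {
			return false
		}
	}
	return true
}

// Decide returns a hypothetical action label for the item.
// Assumption: failure and mode checks take precedence over reuse.
func Decide(e Existing, requested []string) string {
	switch {
	case !e.ProviderFound && !e.AccountFound:
		return "provision"
	case !e.KeyHealthy || e.AccessStatus == "broken":
		return "repair_or_replace"
	case !e.SameAccessMode:
		return "confirm_per_mode"
	case e.AccountFound && (e.AccountState == "disabled" || e.AccountState == "deprecated"):
		return "reactivate"
	case !covers(e.Families, requested):
		return "patch_only"
	case e.AccountFound && e.AccountState == "active":
		return "reuse_duplicate_active_account"
	default:
		return "reuse"
	}
}

func main() {
	e := Existing{
		ProviderFound: true, AccountFound: true, AccountState: "deprecated",
		AccessStatus: "active", KeyHealthy: true, SameAccessMode: true,
		Families: []string{"kimi-2.6"},
	}
	fmt.Println(Decide(e, []string{"kimi-2.6"}))
}
```
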
### 7.4 Item projection after reuse

When reuse is hit, the item must still produce a new V2 record that states:

- `provision_reused = true`
- `reused_from_provider_id`
- `reused_from_account_id`
- `matched_account_state`
- `account_resolution`

### 7.4.1 Handling principles for existing accounts

V2 must answer two questions at once:

1. Was the provider reused this time?
2. What is the current state of the matched existing account?

For an account matched by `host_id + base_url + api_key_fingerprint`:

- `active`
  - Do not create a duplicate account
  - `matched_account_state=active`
  - `account_resolution=reused`
  - UI text: "Duplicate, already enabled"
- `disabled` / `deprecated`
  - Try to re-enable the existing account first
  - `matched_account_state=disabled|deprecated`
  - `account_resolution=reactivated`
  - UI text: "Deprecated, quickly re-enabled"
- `broken`
  - Do not reuse directly
  - `matched_account_state=broken`
  - `account_resolution=replaced`
  - Enter the repair/replace flow

### 7.5 Key fingerprint

V2 does not use the raw key string as the basis for duplicate matching. It stores:

- `api_key_fingerprint`

The fingerprint distinguishes:

- A repeated import of the same key
- An additional key under the same URL

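
A minimal sketch of a fingerprint that supports both cases above. The spec does not say how `api_key_fingerprint` is computed; SHA-256 truncated to 16 hex characters is an assumption, and `APIKeyFingerprint` and `accountKey` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// APIKeyFingerprint returns a short, non-reversible identifier for a key.
// Assumption: SHA-256 of the trimmed key, first 16 hex characters.
func APIKeyFingerprint(apiKey string) string {
	sum := sha256.Sum256([]byte(strings.TrimSpace(apiKey)))
	return hex.EncodeToString(sum[:])[:16]
}

// accountKey is the hypothetical lookup tuple from section 7.2:
// host_id + base_url + api_key_fingerprint.
type accountKey struct {
	HostID      string
	BaseURL     string
	Fingerprint string
}

func main() {
	seen := map[accountKey]bool{}
	imports := [][2]string{
		{"https://example.com/v1", "sk-aaa"},
		{"https://example.com/v1", "sk-aaa"}, // same key again
		{"https://example.com/v1", "sk-bbb"}, // another key, same URL
	}
	for _, in := range imports {
		k := accountKey{HostID: "h1", BaseURL: in[0], Fingerprint: APIKeyFingerprint(in[1])}
		fmt.Println(k.Fingerprint, "duplicate:", seen[k])
		seen[k] = true
	}
}
```

An unsalted hash is used here only to keep the sketch short; a deployment that worries about offline guessing of keys might prefer a keyed construction such as HMAC.
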
## 8. Channel / Account Evolution Contract

V2 no longer expresses channel updates through a "thin patch interface". A host patch must be expressed as a complete contract:

```text
ChannelPatchContract
- model_mapping: map[string]string
- model_pricing: map[string]PriceSpec
- restrict_models: true
- billing_model_source: "channel_mapped"
```

Constraints:

- `model_mapping` also records the raw → canonical mapping
- `model_pricing` may default to zero values, but the field must be fully present
- A patch must not break existing models
- Interfaces such as `PatchChannel(addModels []string)` are no longer the V2 canonical contract

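
The constraints above suggest an additive merge: a patch may add mappings and pricing entries but never removes or overwrites what a channel already has. The sketch below illustrates that idea under stated assumptions: `PriceSpec` fields are hypothetical, pricing is keyed by the mapped model name, and `MergePatch` is not a host API.

```go
package main

import "fmt"

// PriceSpec fields are hypothetical; the spec only names the type.
type PriceSpec struct {
	InputPrice  float64
	OutputPrice float64
}

type ChannelPatchContract struct {
	ModelMapping       map[string]string    // raw model name -> mapped model name
	ModelPricing       map[string]PriceSpec // assumption: keyed by mapped model name
	RestrictModels     bool
	BillingModelSource string
}

// MergePatch builds the full contract to send: everything the channel
// already has, plus the additions. Existing entries are never overwritten,
// and every newly mapped model gets a (possibly zero) pricing entry.
func MergePatch(current ChannelPatchContract, add map[string]string) ChannelPatchContract {
	out := ChannelPatchContract{
		ModelMapping:       make(map[string]string, len(current.ModelMapping)+len(add)),
		ModelPricing:       make(map[string]PriceSpec, len(current.ModelPricing)+len(add)),
		RestrictModels:     true,
		BillingModelSource: "channel_mapped",
	}
	for raw, mapped := range current.ModelMapping {
		out.ModelMapping[raw] = mapped
	}
	for model, price := range current.ModelPricing {
		out.ModelPricing[model] = price
	}
	for raw, mapped := range add {
		if _, exists := out.ModelMapping[raw]; !exists {
			out.ModelMapping[raw] = mapped
		}
		target := out.ModelMapping[raw]
		if _, priced := out.ModelPricing[target]; !priced {
			out.ModelPricing[target] = PriceSpec{} // zero value, but present
		}
	}
	return out
}

func main() {
	current := ChannelPatchContract{
		ModelMapping: map[string]string{"deepseek-ai/DeepSeek-V4-Pro": "deepseek-v4-pro"},
		ModelPricing: map[string]PriceSpec{"deepseek-v4-pro": {InputPrice: 1, OutputPrice: 2}},
	}
	patched := MergePatch(current, map[string]string{"Kimi-K2.6": "kimi-k2.6"})
	fmt.Println(len(patched.ModelMapping), len(patched.ModelPricing), patched.BillingModelSource)
}
```
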
## 9. Async Confirmation Mechanism

### 9.1 Why V2 must have a background confirmer

V2's stability goal cannot rest on "sequential sleep + retry inside the request thread". An independent background mechanism must advance:

- Items in the `confirm` stage
- Items that are temporarily advisory because of a probe race
- Items waiting for warm-up because of `503 no available accounts`

### 9.2 Canonical executor

V2 must implement a `ConfirmationWorker`:

```text
ConfirmationWorker
- poll import_run_items where current_stage='confirm'
- condition: next_retry_at <= now
- acquire lease
- run confirm logic
- update item state
- release lease
```

### 9.3 Required fields

`import_run_items` must have at least:

- `confirmation_attempts`
- `retry_count`
- `last_retry_at`
- `next_retry_at`
- `lease_owner`
- `lease_until`

### 9.4 Restart safety

The first release of V2 already requires:

- After a process restart, unfinished confirm items are picked up again by the worker
- The page shows which stage an item is stopped at
- The CLI `--confirm-wait-timeout` is only a "wait window", not the confirmation mechanism itself

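
The lease fields are what make the worker restart-safe: a lease that outlives a crashed process simply expires, and the item becomes due again. The sketch below shows that logic on an in-memory item. It is an illustration only: `Due`, `TryAcquire`, `Finish`, `clock`, and the two durations are hypothetical, and a real worker would perform the acquire step as one conditional update in the state store so that two workers cannot both win.

```go
package main

import (
	"fmt"
	"time"
)

// Item carries the confirmation-related columns of import_run_items.
type Item struct {
	ItemID               string
	CurrentStage         string
	ConfirmationAttempts int
	NextRetryAt          time.Time
	LeaseOwner           string
	LeaseUntil           time.Time
}

// Hypothetical tuning values for this sketch.
const (
	leaseTTL     = time.Minute
	retryBackoff = 10 * time.Second
)

// clock builds a timestamp from seconds, to keep the example deterministic.
func clock(sec int64) time.Time { return time.Unix(sec, 0) }

// Due reports whether a worker may pick the item up: it is in the confirm
// stage, its retry time has arrived, and no unexpired lease is held.
func Due(it Item, now time.Time) bool {
	leaseFree := it.LeaseOwner == "" || !it.LeaseUntil.After(now)
	return it.CurrentStage == "confirm" && !it.NextRetryAt.After(now) && leaseFree
}

// TryAcquire takes the lease if the item is due.
func TryAcquire(it *Item, owner string, now time.Time, ttl time.Duration) bool {
	if !Due(*it, now) {
		return false
	}
	it.LeaseOwner = owner
	it.LeaseUntil = now.Add(ttl)
	return true
}

// Finish records one attempt, releases the lease, and either advances the
// item to the validate stage or schedules the next retry.
func Finish(it *Item, confirmed bool, now time.Time, backoff time.Duration) {
	it.ConfirmationAttempts++
	it.LeaseOwner = ""
	it.LeaseUntil = time.Time{}
	if confirmed {
		it.CurrentStage = "validate"
		return
	}
	it.NextRetryAt = now.Add(backoff)
}

func main() {
	it := Item{ItemID: "item-1", CurrentStage: "confirm", NextRetryAt: clock(0)}

	fmt.Println(TryAcquire(&it, "worker-a", clock(0), leaseTTL))  // worker-a wins
	fmt.Println(TryAcquire(&it, "worker-b", clock(30), leaseTTL)) // lease still live
	// worker-a crashes; after the lease expires another worker takes over.
	fmt.Println(TryAcquire(&it, "worker-b", clock(120), leaseTTL))
	Finish(&it, true, clock(125), retryBackoff)
	fmt.Println(it.CurrentStage, it.ConfirmationAttempts)
}
```
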
## 10. Single Source of Truth

### 10.1 Canonical runtime tables

The V2 runtime recognizes only three kinds of tables:

- `import_runs`
- `import_run_items`
- `import_run_item_events`

### 10.2 Legacy linkage

If a V2 item invoked the existing v1 provision flow, the item may keep:

- `legacy_batch_id`
- `legacy_provider_id`

These fields are trace links only and cannot replace the V2 state source.

### 10.3 Result page data source

Result pages and APIs read only the V2 canonical tables and do not join directly against:

- `import_batches`
- `probe_results`
- `access_closure_records`
- The host database

## 11. Result API and Pages

### 11.1 API

V2 standard API:

```text
POST /api/batch-import/runs
GET  /api/batch-import/runs
GET  /api/batch-import/runs/{run_id}
GET  /api/batch-import/runs/{run_id}/items
GET  /api/batch-import/runs/{run_id}/items/{item_id}
```

The legacy API `/api/import-batches/*` is retained but marked as v1/legacy.

### 11.2 Pages

```text
/batch-import/runs
/batch-import/runs/{run_id}
```

The result page must directly answer:

- Which URLs were imported successfully
- Which items are stuck in `probe/provision/confirm/validate`
- Which items had a model correction
- Which items are advisory rather than broken
- How many times an item was retried
- What the reason for the current warning is

## 12. CLI Contract

```bash
go run ./cmd/cli batch-import \
  --host-id "<host_id>" \
  --entry "https://example.com/v1,sk-xxx" \
  --batch-file "./keys.csv" \
  --mode "strict|partial" \
  --access-mode "subscription|self_service" \
  --subscription-users "u1,u2" \
  --subscription-days 30 \
  --probe-api-key "<user_gateway_key>" \
  --confirm-wait-timeout 15s
```

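
The `--entry` value in the command above packs `base_url` and `api_key` into one comma-separated string. A parser for it might look like the sketch below. Splitting at the first comma and requiring a scheme and host are assumptions, `ParseEntry` is a hypothetical helper, and the format of `--batch-file` is not covered because the spec does not define it.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Entry mirrors the two required fields of BatchImportEntry.
type Entry struct {
	BaseURL string
	APIKey  string
}

// ParseEntry splits "<base_url>,<api_key>" at the first comma.
// Assumption: the base URL itself contains no comma.
func ParseEntry(s string) (Entry, error) {
	rawURL, key, found := strings.Cut(s, ",")
	rawURL, key = strings.TrimSpace(rawURL), strings.TrimSpace(key)
	if !found || rawURL == "" || key == "" {
		return Entry{}, errors.New("entry must look like \"<base_url>,<api_key>\"")
	}
	u, err := url.Parse(rawURL)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return Entry{}, fmt.Errorf("invalid base_url %q", rawURL)
	}
	return Entry{BaseURL: rawURL, APIKey: key}, nil
}

func main() {
	e, err := ParseEntry("https://example.com/v1,sk-xxx")
	fmt.Println(e.BaseURL, e.APIKey, err)

	_, err = ParseEntry("https://example.com/v1")
	fmt.Println(err)
}
```
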
CLI output must include at least:

- `run_id`
- `result_page`
- The `resolved_smoke_model` of each entry
- A capability summary
- `confirmation_status`
- `access_status`
- The recommended model name (if a correction happened)

## 13. Error Policy

### Blocking

- `401/403 unauthorized` with evidence that the key is invalid
- `/v1/models` is completely unavailable and there is no alternative path
- Provisioning fails explicitly

### Advisory

- A third-party upstream returns `/responses=403` while `/chat/completions=200`
- The first `/accounts/:id/test=403`, where the probe race has been recognized
- The first `/v1/chat/completions=503 no available accounts`, which recovers after a retry
- `429 rate_limit`

### Access status ownership

- `confirmation_status=advisory` does not automatically imply `access_status=degraded`
- Only the Validation Engine may mark an item `active/degraded/broken`

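
The blocking and advisory rules can be read as a classifier over observed evidence. The sketch below is one hypothetical encoding: the `Evidence` fields, the `Classify` function, the `undetermined` fallback, and the precedence between rules are assumptions. It deliberately returns only a policy label and never an `access_status`, which stays with the Validation Engine.

```go
package main

import "fmt"

// Evidence describes one observed failure. All fields are hypothetical.
type Evidence struct {
	Endpoint            string // e.g. "/responses", "/v1/models"
	Status              int    // HTTP status; 0 means no response at all
	KeyProvenInvalid    bool
	HasAlternativePath  bool
	ChatCompletionsOK   bool
	ProbeRaceRecognized bool
	RecoveredAfterRetry bool
	ProvisionFailed     bool
}

// Classify returns "blocking", "advisory", or "undetermined".
// Assumption: blocking evidence about the key or provisioning wins first.
func Classify(e Evidence) string {
	switch {
	case e.ProvisionFailed:
		return "blocking"
	case (e.Status == 401 || e.Status == 403) && e.KeyProvenInvalid:
		return "blocking"
	case e.Status == 429:
		return "advisory"
	case e.Endpoint == "/v1/models" && (e.Status == 0 || e.Status >= 400) && !e.HasAlternativePath:
		return "blocking"
	case e.Endpoint == "/responses" && e.Status == 403 && e.ChatCompletionsOK:
		return "advisory"
	case e.Endpoint == "/accounts/:id/test" && e.Status == 403 && e.ProbeRaceRecognized:
		return "advisory"
	case e.Endpoint == "/v1/chat/completions" && e.Status == 503 && e.RecoveredAfterRetry:
		return "advisory"
	default:
		return "undetermined"
	}
}

func main() {
	fmt.Println(Classify(Evidence{Endpoint: "/responses", Status: 403, ChatCompletionsOK: true}))
	fmt.Println(Classify(Evidence{Endpoint: "/v1/models", Status: 401, KeyProvenInvalid: true}))
	fmt.Println(Classify(Evidence{Endpoint: "/v1/chat/completions", Status: 503}))
}
```
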
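## 14. Success Criteria

1. The `access_mode` input contract is complete; `subscription` and `self_service` can each be delivered on its own
2. Run / item state, retries, warnings, and error stages are persisted and remain visible after a restart
3. Result pages and APIs read only the V2 canonical tables
4. Model correction results, the capability profile, and recommended model names are traceable
5. `/responses` misjudgments on third-party compatible upstreams and the host's asynchronous window do not turn a working chain directly into a final failure
6. Pages clearly distinguish `confirmed/advisory/failed` from `active/degraded/broken`
7. OpenAPI, SPEC, TDD, and Architecture stay consistent on every field and every state enum
8. When a successfully imported provider is added again and its model families are already covered, it is reused automatically and not created again
9. Slight naming differences for the same model across relays are quickly recognized as the same model family through `canonical_model_family`
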
## 15. Non-goals for first implementation

- Automatic scheduling across multiple keys
- Real-time push
- Automatic pricing strategy
- Automatic load balancing

## 16. Final decisions

1. `provider_id` uses `normalized_host + url_hash_last8`
2. `requested_models` is only a hint, not a source of truth
3. The `Validation Engine` is the sole writer of `access_status`
4. The V2 runtime canonical tables are `import_runs/import_run_items/import_run_item_events`
5. `ConfirmationWorker` is a required V2 component, not an optional enhancement
6. Cross-relay matching of the same model relies on `canonical_model_family`, not on the raw model name alone