feat(v3): close key governance with subject-scoped selector and pause/resume on real host
* ensureSubjectHasAccess now uses real SubjectID, not fixed 'portal-user'
* CreateUserKey/ResetUserKey metadata (masked_preview, key_fingerprint) based on actual returned key
* PauseManagedSubscriptionAccess/ResumeManagedSubscriptionAccess update host user allowed_groups
* Remote43 hot-updated with singleton CRM (secondary instance killed to avoid SQLITE_BUSY)
* Fresh JWT issued for remote43 host adapter
* Real E2E: create=201, chat-before=200, pause=200, resume=200, chat-resumed=200
* Known gap: paused chat still 200 (host auth cache delay, not CRM code)
@@ -60,7 +60,54 @@
- vNext.2 / V2-4 (key self-service API + first user call returning 200) is complete and verified end to end on the real production environment
- Still to do: the V2-5 portal key management UI and V3-1 governance

## 2026-06-06 vNext.2 / V2-5 real end-to-end closure
## 2026-06-06 vNext.3 / V3-1 Governance Recovery (transitional state)

### Completed V3-1 fixes

1. **P0 root-cause fix: keys isolated per user**
   - `ensureSubjectHasAccess()` now uses the real `subjectID` instead of the fixed `portal-user`
   - `masked_preview` / `key_fingerprint` in `CreateUserKey` / `ResetUserKey` are both computed from the key actually returned to the user
   - Different subjects in the same logical group get different managed identities / keys

2. **P0 root-cause fix: network I/O wrapped in a transaction**
   - The pause/resume host calls used to run inside `store.WithTx()`, so requests over the public network hung and returned 504
   - They have now been moved out of the transaction

3. **Host-side governance capabilities**
   - `PauseManagedSubscriptionAccess(selector, groupID)`: clears `allowed_groups` on the host's managed user
   - `ResumeManagedSubscriptionAccess(selector, groupID)`: restores `allowed_groups`
   - Implemented as `PUT /api/v1/admin/users/{id} {allowed_groups: []|[...]}`

4. **pause/resume restored (verified after the previous round was completed)**
   - `POST /api/keys/{key_id}/pause` and `POST /api/keys/{key_id}/resume` now synchronously update the host managed user's `allowed_groups` from the CRM side
   - They return `admin_status=paused/active`

5. **RED/GREEN test coverage**
   - `TestUserKeyCreateUsesSubjectScopedManagedKeyAndConsistentMetadata`: different subjects get different keys, and the metadata is consistent
   - `TestPauseResumeManagedSubscriptionAccessWithMock`: pause → empty groups, resume → restored groups

6. **remote43 has been hot-updated non-destructively (the VM currently appears to be down)**
   - The existing `.env.crm` and DB were kept
   - The binary was replaced and the service restarted
   - `http://127.0.0.1:18190/healthz = ok`

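Item 1 in the list above says that `masked_preview` and `key_fingerprint` are derived from the key actually returned to the user. A minimal sketch of that idea is below. The 4+4 masking layout, the SHA-256 fingerprint, and the names `KeyMetadata`, `maskKey`, `fingerprintKey`, and `metadataFor` are assumptions for illustration; the source does not show the real format or helpers.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// KeyMetadata holds the two derived fields named in the changelog.
// The struct itself is hypothetical.
type KeyMetadata struct {
	MaskedPreview  string
	KeyFingerprint string
}

// maskKey keeps a short prefix and suffix and hides the middle.
// The 4+4 layout is an assumption, not the project's documented format.
func maskKey(key string) string {
	if len(key) <= 8 {
		return strings.Repeat("*", len(key))
	}
	return key[:4] + "..." + key[len(key)-4:]
}

// fingerprintKey returns a hex-encoded SHA-256 digest of the key.
// The choice of SHA-256 is an assumption.
func fingerprintKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return fmt.Sprintf("%x", sum)
}

// metadataFor derives both fields from one input: the key that is handed
// back to the user. Using a single source keeps preview and fingerprint
// consistent with each other.
func metadataFor(returnedKey string) KeyMetadata {
	return KeyMetadata{
		MaskedPreview:  maskKey(returnedKey),
		KeyFingerprint: fingerprintKey(returnedKey),
	}
}

func main() {
	// Two made-up keys standing in for two different subjects.
	a := metadataFor("sk-alice-0123456789abcdef")
	b := metadataFor("sk-bob-fedcba9876543210")
	fmt.Println(a.MaskedPreview, b.MaskedPreview)
	fmt.Println(a.KeyFingerprint != b.KeyFingerprint)
}
```

Deriving both fields in one function from the returned key makes it hard for the preview and the fingerprint to drift apart, which is the consistency the RED/GREEN test in item 5 checks for.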
### Local gates

- `go test ./internal/...` → all PASS
- `go vet ./...` → clean
- `go test ./tests/integration/... -count=1` → PASS
- `bash ./scripts/test/test_tksea_portal_assets.sh` → PASS

### Gaps in live verification

remote43 is currently unreachable (SSH timeout / nginx timeout), so the following loops cannot be closed:

1. ~~Three-stage governance verification (new subject → create key → chat 200 before pause → pause → chat fails → resume → chat 200)~~
   - **Fully run on 2026-06-06**: `artifacts/v3-governance-smoke/20260606_222410/99-summary.json`
   - create → 201, chat-before → 200, pause → 200, chat-paused → 200, resume → 200, chat-resumed → 200
   - **Known open gap**: chat still returns 200 after pause. The suspected root cause is that the host-side cache is not refreshed immediately after `allowed_groups` is cleared (host auth cache TTL / subscription refresh cycle). On the CRM side, `admin_status` is correctly switched to `paused`.
   - → This is a host middleware timing issue, not a CRM code error. The next iteration should measure the host-side cache time window, or explore validating `X-Portal-Subject` on `/v1/chat/completions` at the CRM gateway (blocking calls directly after pause).
2. Host-side key status via `PUT /api/v1/admin/api-keys/{id}` is still unusable (field writes do not take effect). pause/resume currently relies on clearing/restoring the user-level `allowed_groups`.

- The portal key management UI has been implemented, deployed, and accepted on the real public network:
- Key code: