- Add migration 0004 to introduce 'claiming' status and timeout index
- Add StatusClaiming to platformevent domain and allow it in Validate()
- Rewrite ListDue as transactional UPDATE ... RETURNING with FOR UPDATE SKIP LOCKED
- Add ReleaseStaleClaims to reset expired claiming events back to retrying
- Worker Start() now runs a 30s ticker for stale claim recovery (5m timeout)
- Update stubEventStore in tests to satisfy new EventStore interface
Refs: D-02
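The claim/release cycle above can be sketched in memory. This is a minimal illustration, not the Postgres implementation: a mutex stands in for the row locks taken by FOR UPDATE SKIP LOCKED, and memStore, Event.ClaimedAt, runRecovery, and the "pending" status are hypothetical names introduced for the example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Intervals quoted in the commit message.
const (
	recoveryInterval = 30 * time.Second
	claimTimeout     = 5 * time.Minute
)

// Status values; "pending" is an assumption, the other two are named above.
const (
	StatusPending  = "pending"
	StatusRetrying = "retrying"
	StatusClaiming = "claiming"
)

// Event is a hypothetical, trimmed-down event row.
type Event struct {
	ID        string
	Status    string
	ClaimedAt time.Time
}

// memStore stands in for the Postgres-backed store; the mutex plays the
// role that row locks play in the real UPDATE ... RETURNING query.
type memStore struct {
	mu     sync.Mutex
	events []*Event
}

// ListDue flips up to limit due events to "claiming" in one critical
// section and returns copies, so no event is handed to two workers.
func (s *memStore) ListDue(now time.Time, limit int) []Event {
	s.mu.Lock()
	defer s.mu.Unlock()
	var claimed []Event
	for _, e := range s.events {
		if len(claimed) >= limit {
			break
		}
		if e.Status == StatusPending || e.Status == StatusRetrying {
			e.Status = StatusClaiming
			e.ClaimedAt = now
			claimed = append(claimed, *e)
		}
	}
	return claimed
}

// ReleaseStaleClaims resets events that have been "claiming" for longer
// than timeout back to "retrying" and reports how many were released.
func (s *memStore) ReleaseStaleClaims(now time.Time, timeout time.Duration) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	released := 0
	for _, e := range s.events {
		if e.Status == StatusClaiming && now.Sub(e.ClaimedAt) > timeout {
			e.Status = StatusRetrying
			released++
		}
	}
	return released
}

// runRecovery is a hypothetical stand-in for the loop started by the worker:
// on every tick it releases stale claims, until ctx is cancelled.
func runRecovery(ctx context.Context, s *memStore, interval, timeout time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case now := <-ticker.C:
			s.ReleaseStaleClaims(now, timeout)
		}
	}
}

func main() {
	s := &memStore{events: []*Event{
		{ID: "a", Status: StatusPending},
		{ID: "b", Status: StatusRetrying},
	}}
	now := time.Now()
	fmt.Println("claimed:", len(s.ListDue(now, 10)))
	fmt.Println("claimed again:", len(s.ListDue(now, 10)))
	// Simulate a crashed worker: six minutes later both claims are stale.
	fmt.Println("released:", s.ReleaseStaleClaims(now.Add(6*time.Minute), claimTimeout))

	// A real worker would pass recoveryInterval; short values keep the demo fast.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Millisecond)
	defer cancel()
	runRecovery(ctx, s, 10*time.Millisecond, claimTimeout)
}
```

Released events go back to "retrying" rather than "pending", so the next ListDue call picks them up again.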
D-03: document non-transactional boundaries.
- Comment in platform_webhook_handler.go explaining that dialog.Process
and outbox.InsertPendingBatch do not run in a single transaction; HTTP
500 is returned on outbox failure so that the caller can retry.
- Package-level comment in dialog/service.go noting the lack of a unified
outer transaction and the eventually consistent nature of storage
operations.
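The boundary documented above can be sketched as a handler that runs the two steps back to back. This is an illustration under assumptions: processFunc and insertFunc are hypothetical stand-ins for dialog.Process and outbox.InsertPendingBatch (whose real signatures are not shown here), and returning 500 when the first step fails is also assumed.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical stand-ins for dialog.Process and outbox.InsertPendingBatch.
type processFunc func() error
type insertFunc func() error

// errOutbox simulates an outbox write failure.
var errOutbox = errors.New("outbox unavailable")

// webhookHandler runs both steps with no shared transaction. If the outbox
// insert fails, the effects of process() are NOT rolled back; the 500 only
// asks the caller to retry, so both steps need to tolerate being repeated.
func webhookHandler(process processFunc, insert insertFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := process(); err != nil {
			// Assumption: a processing failure is also reported as 500.
			http.Error(w, "dialog processing failed", http.StatusInternalServerError)
			return
		}
		if err := insert(); err != nil {
			http.Error(w, "outbox insert failed", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// statusFor drives the handler once and returns the HTTP status code.
func statusFor(process processFunc, insert insertFunc) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/webhook", nil)
	webhookHandler(process, insert)(rec, req)
	return rec.Code
}

func main() {
	ok := func() error { return nil }
	fail := func() error { return errOutbox }
	fmt.Println("both steps succeed:", statusFor(ok, ok))
	fmt.Println("outbox fails:", statusFor(ok, fail))
}
```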
D-01: callback_target contract drift cleanup.
- Remove CallbackTarget from Event struct and Validate
- Remove CallbackTarget from PlatformInboundMeta
- Remove defaultCallbackTarget and assignment from builder
- Remove callback_target column from INSERT/SELECT/dead_letter SQL
- Clean up all test literals and assertions
DB migration left untouched; column remains empty until a future schema
cleanup migration.
Fixes the 'invalid input syntax for type uuid' error raised when writing
ticket workflow audit logs. The audit Event.ID was built with fmt.Sprintf
from a nanosecond timestamp ('wf-%d'), which is not valid input for
PostgreSQL's uuid type.
Also adds uuid import to ticket_workflow.go.
Verified: full chain webhook→assign→resolve→close produces 3 audit
logs correctly, no more 'invalid uuid' errors in logs.
- Replace the fmt.Sprintf-based IDs (sess.ID + nanotime), which were not valid UUIDs
- Ticket creation and audit logging now use github.com/google/uuid
- Fixes 500 error when webhook processes messages with PG store
- All 23 tests pass; Gate B verified end-to-end
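Why the old IDs were rejected can be seen by checking both formats against the canonical UUID shape. The sketch below is dependency-free on purpose: newUUIDv4 is a standard-library stand-in for uuid.NewString() from github.com/google/uuid, and legacyID and isUUID are hypothetical helpers, not code from this repository.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"regexp"
	"time"
)

// uuidPattern matches the canonical 8-4-4-4-12 hex form. PostgreSQL's uuid
// type accepts this form (and a few alternative spellings not covered here).
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// isUUID reports whether s has the canonical UUID text form.
func isUUID(s string) bool { return uuidPattern.MatchString(s) }

// legacyID reproduces the old 'wf-%d' scheme described above.
func legacyID(nanos int64) string {
	return fmt.Sprintf("wf-%d", nanos)
}

// newUUIDv4 builds a random (version 4) UUID with the standard library only.
// In the real code this role is played by github.com/google/uuid.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	old := legacyID(time.Now().UnixNano())
	fmt.Println(old, "valid uuid:", isUUID(old))

	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, "valid uuid:", isUUID(id))
}
```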
1. config.go: AI_CS_ENV runtime mode with production restriction
- New RuntimeConfig.Env field (AI_CS_ENV / AI_CS_RUNTIME_ENV)
- production + Postgres.Enabled=false → Load() returns error
- production + empty webhook secret → Load() returns error
- normalizeRuntimeEnv: dev/development → development, prod/production → production, test → test
2. app.go: probe.SetReady only when store is confirmed ready
- Postgres.Enabled: probe.SetReady(true) after DB+migration OK
- Memory mode: probe.SetReady(false) — not production-ready
3. health_handler_test.go: add probe live+ready state transition tests
4. config_test.go: add TestLoad_RejectsProdWhenPostgresDisabled,
TestLoad_RejectsProdWhenWebhookSecretMissing
5. app_test.go: add TestNew_RejectsMemoryModeWithoutExplicitNonProdEnv,
TestNew_AllowsMemoryModeInTestEnv, TestNew_WithPostgresEnabled_*
for invalid DSN and migration-failure paths
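The production restrictions from item 1 can be sketched as a small validation step. Config and validate are hypothetical simplifications of RuntimeConfig and Load(); the handling of unknown environment values and the exact set of accepted aliases are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config is a hypothetical, flattened view of the settings named above.
type Config struct {
	Env             string // from AI_CS_ENV / AI_CS_RUNTIME_ENV
	PostgresEnabled bool
	WebhookSecret   string
}

// normalizeRuntimeEnv maps aliases to canonical names. Passing unknown
// values through (trimmed and lowercased) is an assumption of this sketch.
func normalizeRuntimeEnv(raw string) string {
	v := strings.ToLower(strings.TrimSpace(raw))
	switch v {
	case "dev", "development":
		return "development"
	case "prod", "production":
		return "production"
	case "test":
		return "test"
	default:
		return v
	}
}

// validate mirrors the two production restrictions described for Load():
// production needs Postgres enabled and a non-empty webhook secret.
func validate(cfg Config) error {
	if normalizeRuntimeEnv(cfg.Env) != "production" {
		return nil
	}
	if !cfg.PostgresEnabled {
		return errors.New("production requires Postgres to be enabled")
	}
	if cfg.WebhookSecret == "" {
		return errors.New("production requires a webhook secret")
	}
	return nil
}

func main() {
	fmt.Println(validate(Config{Env: "prod"}))
	fmt.Println(validate(Config{Env: "prod", PostgresEnabled: true}))
	fmt.Println(validate(Config{Env: "prod", PostgresEnabled: true, WebhookSecret: "s3cret"}))
	fmt.Println(validate(Config{Env: "test"}))
}
```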
Phase 1 (code gate) objectives met:
✅ prod cannot fall back to memory store
✅ readiness reflects actual store readiness
✅ both changes have test coverage
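The readiness behaviour from item 2 can be sketched with a probe whose ready flag is flipped only once the store is confirmed. Probe here is a hypothetical reimplementation for illustration; only SetReady is named in the commit, and the 503 status for "not ready" is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Probe is a hypothetical health probe with separate live and ready states.
type Probe struct {
	ready atomic.Bool
}

// SetReady records whether the backing store is confirmed ready.
func (p *Probe) SetReady(v bool) { p.ready.Store(v) }

// Live always answers 200: the process is up, whatever the store state.
func (p *Probe) Live(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// Ready answers 200 only after SetReady(true), and 503 otherwise.
func (p *Probe) Ready(w http.ResponseWriter, _ *http.Request) {
	if p.ready.Load() {
		w.WriteHeader(http.StatusOK)
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
}

// code invokes a handler once and returns the status code it wrote.
func code(h http.HandlerFunc) int {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	p := &Probe{}
	fmt.Println("live:", code(p.Live), "ready:", code(p.Ready))

	// After DB connection and migrations succeed, the app marks itself ready.
	p.SetReady(true)
	fmt.Println("live:", code(p.Live), "ready:", code(p.Ready))
}
```

Keeping liveness independent of readiness means an instance running in memory mode stays alive for debugging while still being excluded from traffic.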