Add partition management integration to the smart ops system:
- Backend: Add GetUsageLogsPartitionStatus endpoint in OpsHandler
- Backend: Add partition query methods in OpsRepository
- Backend: Add UsageLogsPartitionStatus type in OpsService
- Frontend: Add OpsPartitionStatusCard component
- Frontend: Add partition status display in OpsDashboard
- i18n: Add Chinese and English translations
The partition status card shows:
- Whether usage_logs is partitioned
- Current row count vs threshold (100K)
- Partition count (if partitioned)
- Warning message when partitioning is recommended
This allows administrators to monitor partition status directly
from the ops dashboard without checking server logs.
Add CheckUsageLogsPartitioning function that:
- Checks if usage_logs table is partitioned
- Warns with prominent banner if not partitioned and rows > 100K
- Provides actionable guidance for manual partition migration
This helps operators identify performance risks early and take
appropriate action before data volume causes issues.
Add prominent warning messages when JWT secret is auto-generated:
- Use multi-line banner format for better visibility
- Include actionable guidance for production deployments
- Update both setup.go and security_secret_bootstrap.go
This helps operators notice the security concern and take
appropriate action before deploying to production.
Restore gateway_service.go, setting_handler.go, routes/admin.go,
dto/settings.go, group_repo.go, api_key_repo.go, wire_gen.go to
upstream/main versions and surgically remove only Sora references.
This preserves upstream-only features (RequireOauthOnly, RequirePrivacySet,
GroupResolution, etc.) that were missing when using release branch versions.
- Remove media_type column from all INSERT/SELECT/SCAN in usage_log_repo
- Remove media_type mock arg from request_type and integration tests
- Adjust scan stub value arrays from 47 to 46 elements
- GetGroupPlatforms failure now stores error-TTL cache and returns error (fail-close)
- Frontend group-to-channel conflict map loads all channels instead of current page only
- Toggle channel status reloads list when active filter would hide the changed item
- Fix errcheck: defer rows.Close() with nolint
- Fix errcheck: type assertion with ok check in channel cache
- Fix staticcheck ST1005: lowercase error string
- Fix staticcheck SA5011: nil check cost before use in openai gateway
- Fix gofmt: format chatcompletions_to_responses.go
- Parse candidatesTokensDetails from Gemini API to separate image/text output tokens
- Add image_output_tokens and image_output_cost to usage_log (migration 089)
- Support per-image-token pricing via output_cost_per_image_token from model pricing data
- Channel pricing ImageOutputPrice override works in token billing mode
- Auto-fill image_output_price in channel pricing form from model defaults
- Add "channel_mapped" billing model source as new default (migration 088)
- Bills by model name after channel mapping, before account mapping
- Fix channel cache error TTL sign error (115s → 5s)
- Fix Update channel only invalidating new groups, not removed groups
- Fix frontend model_mapping clearing sending undefined instead of {}
- Credits balance precheck via shared AccountUsageService cache before injection
- Skip credits injection for accounts with insufficient balance
- Don't mark credits exhausted for "exhausted your capacity on this model" 429s
Move constants, detection, and penalty functions from
antigravity_gateway_service.go to antigravity_internal500_penalty.go.
Fix gofmt alignment and replace hardcoded duration strings with
constant references.
## Problem
When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.
## Changes
### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
antigravity client, so proxy connectivity issues fail within 5s
instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
immediately with "proxy unavailable" error (no retries)
### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
(both background `TokenRefreshService` and request-path
`GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`
### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
`Forward`/`ForwardGemini` to trigger handler-level account switch
instead of returning 502 directly
### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s