Auth pillar 從 OAuth 2.0 resource server 改成 pre-shared API key (visionA ↔ converter 1:1 internal trust)。新增 GET /api/v1/jobs/:id/result streaming endpoint 給 visionA backend 中轉 NEF 下載。 Phase A(auth 切換): - 新增 apiKeyMiddleware(constant-time compare、tokenFingerprint、4 audit events) - 砍 OAuth middleware + JWKS(保留 oauthClient 供 promote → FAA 使用) - 4 個 endpoint 換掛 requireApiKey - 加 TRUST_PROXY env + Express trust proxy 設定(forensic source_ip) Phase B(/result endpoint): - streaming NEF download with 5min timeout + concurrent cap 10 - Two-tier rate limit(burst 5/10s + sustained 20/min) - Bandwidth quota(1 GB/hr + 6 GB/24hr)by token_fingerprint - Range header silently ignored + Accept-Ranges: none - filename quote-escape + RFC 5987 fallback + sanitize - 8 個 /result audit events(forensic 完整) 設計演進記錄:docs/TODO-visionA-integration-v2.md(5/2 OAuth → 5/16 API key → 5/16 download via converter;對應 visionA repo ADR-015/016) Tests: 597 → 666 (+69)、29 suites all pass Security: APPROVE WITH CONDITIONS(單 instance 部署、6 新 env、24hr 監控) npm audit: 3 vuln → 0(transitive AWS SDK xml chain) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
188 lines
5.5 KiB
Markdown
188 lines
5.5 KiB
Markdown
# Database 設計
|
||
|
||
> **狀態**:Phase 1 完工 — Phase 0.8b **完全不動**。
|
||
>
|
||
> **配套**:`design-doc.md` §3.7、`api/api-jobs.md`。
|
||
|
||
---
|
||
|
||
## 1. 為什麼用 Redis、不用 PostgreSQL
|
||
|
||
- Phase 1 資料模式簡單:job 是 state machine、user index 是 key-value
|
||
- 既有哲學「Crash 即 Reset」對 Redis 友善(PG 引入持久化反而變複雜)
|
||
- Redis Set 做 user 索引足夠(單 user 7 天內 < 10 個 job)
|
||
- 未來若要跨 Crash recovery / 多 instance HA,再評估 PG
|
||
|
||
---
|
||
|
||
## 2. Key 規劃
|
||
|
||
| Key | 類型 | 用途 | TTL |
|
||
|-----|------|------|-----|
|
||
| `job:{job_id}` | String (JSON) | Job 完整 record | 7 天 |
|
||
| `user:{user_id}:jobs` | Set | 該 user 所有 job_id(不分狀態) | 每次寫入時 `EXPIRE 7d` |
|
||
| `user:{user_id}:active_job` | String | 當前 in-progress job_id(= `created` 或 `running`)| 隨 job 結束刪除 |
|
||
| `ratelimit:client:{client_id}` | 由 `express-rate-limit` 管理 | per-client_id rate limit | 5 min |
|
||
| `queue:onnx` / `queue:bie` / `queue:nef` | Redis Stream | Worker 任務佇列 | — |
|
||
| `queue:done` | Redis Stream | Worker 完成事件 | — |
|
||
| `queue:progress` | Redis Stream | Worker stage 內進度(選配,Phase 2)| — |
|
||
|
||
---
|
||
|
||
## 3. Job record schema
|
||
|
||
```jsonc
|
||
{
|
||
// 既有欄位
|
||
"job_id": "uuid",
|
||
"created_at": "...",
|
||
"updated_at": "...",
|
||
"status": "ONNX | BIE | NEF | COMPLETED | FAILED", // 內部仍用大寫
|
||
"stage": "onnx | bie | nef | null",
|
||
"progress": 0,
|
||
"parameters": {
|
||
"model_id": 1001,
|
||
"version": "0001",
|
||
"platform": "520",
|
||
"enable_evaluate": false,
|
||
"enable_sim_fp": false,
|
||
"enable_sim_fixed": false,
|
||
"enable_sim_hw": false
|
||
},
|
||
"output": { // 舊格式(向後相容)
|
||
"bie_path": null,
|
||
"nef_path": null,
|
||
"onnx_path": null
|
||
},
|
||
"result_object_keys": { // 新格式
|
||
"onnx": "jobs/{job_id}/output/out.onnx",
|
||
"bie": "jobs/{job_id}/output/out.bie",
|
||
"nef": "jobs/{job_id}/output/out.nef"
|
||
},
|
||
"error": null,
|
||
"origin": "api | web",
|
||
"user_id": "visionA-user-12345",
|
||
"tenant_id": "uuid-or-null",
|
||
"created_by_client_id": "visionA-service", // API key 模式下固定值
|
||
"source_filename": "model.onnx", // Phase 0.8b 新增(/result endpoint filename 用)
|
||
"input": {
|
||
"filename": "model.onnx",
|
||
"object_key": "jobs/{job_id}/input/model.onnx",
|
||
"size_bytes": 204800000,
|
||
"ref_images_count": 0
|
||
},
|
||
"stage_timings": {
|
||
"onnx": { "started_at": "...", "completed_at": "..." },
|
||
"bie": { "started_at": "...", "completed_at": null },
|
||
"nef": null
|
||
},
|
||
"stage_progress": 0,
|
||
"expires_at": "2026-05-23T12:00:00Z",
|
||
"metadata": {},
|
||
"promoted": false, // 冪等性 flag
|
||
"promoted_object_keys": [] // 已 promote 的目標
|
||
}
|
||
```
|
||
|
||
### 3.1 `source_filename` 欄位
|
||
|
||
Phase 0.8b 新增需求:`/result` endpoint 需要這個欄位構造 download filename。
|
||
|
||
**寫入點**:`POST /api/v1/jobs` handler 在 multer 接收 `model` 檔後,把 `multipart.filename` 寫入 `job.source_filename`(已 sanitized)。
|
||
|
||
**Backend 端 task**:確認 `jobService.createJob` 寫入這個欄位(檢查既有 code、可能已存在;若沒有則補上)。
|
||
|
||
---
|
||
|
||
## 4. 對外 status 映射(不變)
|
||
|
||
詳見 `api/api-jobs.md` §5.3。
|
||
|
||
---
|
||
|
||
## 5. User 索引設計
|
||
|
||
### 5.1 Key 寫入時機
|
||
|
||
```
|
||
建立 job:
|
||
MULTI
|
||
SET job:{id} {...}
|
||
SADD user:{user_id}:jobs {id}
|
||
EXPIRE user:{user_id}:jobs 604800
|
||
SETNX user:{user_id}:active_job {id}
|
||
EXEC
|
||
|
||
若 SETNX 回 0 → 衝突,回滾,回 409
|
||
若 SETNX 回 1 → 成功
|
||
|
||
完成 / 失敗時:
|
||
MULTI
|
||
SET job:{id} {...}
|
||
DEL user:{user_id}:active_job # 僅在 value == 當前 job_id 時才 DEL
|
||
EXEC
|
||
```
|
||
|
||
### 5.2 Lua script(claim_active_job)
|
||
|
||
```lua
|
||
-- KEYS[1] = user:{user_id}:active_job
|
||
-- KEYS[2] = job:{job_id}
|
||
-- KEYS[3] = user:{user_id}:jobs
|
||
-- ARGV[1] = job_id
|
||
-- ARGV[2] = job_json
|
||
-- ARGV[3] = ttl_seconds
|
||
|
||
if redis.call('EXISTS', KEYS[1]) == 1 then
|
||
return {'conflict', redis.call('GET', KEYS[1])}
|
||
end
|
||
redis.call('SET', KEYS[1], ARGV[1])
|
||
redis.call('SET', KEYS[2], ARGV[2])
|
||
redis.call('SADD', KEYS[3], ARGV[1])
|
||
redis.call('EXPIRE', KEYS[3], tonumber(ARGV[3]))
|
||
return {'ok'}
|
||
```
|
||
|
||
### 5.3 避免 `KEYS *`
|
||
|
||
**錯誤做法**:`redis.keys('job:*')` O(N) 阻塞。
|
||
|
||
**正確做法**:
|
||
```javascript
|
||
const ids = await redis.smembers(`user:${userId}:jobs`);
|
||
const pipeline = redis.pipeline();
|
||
for (const id of ids) pipeline.get(`job:${id}`);
|
||
const results = await pipeline.exec();
|
||
```
|
||
|
||
---
|
||
|
||
## 6. 記憶體預估
|
||
|
||
- 每個 job record 約 2-4 KB(含 stage_timings 等)
|
||
- 每個 user index Set 每個元素 < 40 bytes
|
||
- 1000 並發 user × 10 jobs = 10k job record ≈ 40 MB
|
||
|
||
Redis 輕鬆。Converter Bucket lifecycle 7 天,Redis 也跟著 TTL 7 天,記憶體上限可控。
|
||
|
||
---
|
||
|
||
## 7. M5 方案 A:先寫 MinIO 後 Lua claim
|
||
|
||
避免「拿到 Lua claim 但 MinIO 失敗」需要 rollback Redis 的複雜度:
|
||
|
||
- MinIO 失敗 → 直接回 502,Redis 完全乾淨
|
||
- Lua conflict / throw → cleanup MinIO(fire-and-forget,靠 7d lifecycle 兜底)
|
||
- enqueue 失敗 → 補償 release Redis + cleanup MinIO
|
||
|
||
---
|
||
|
||
## 8. Phase 0.8b 變動
|
||
|
||
**無**。Database 完全不動。
|
||
|
||
唯一相關變動:
|
||
|
||
- `created_by_client_id` 在 API key 模式下會固定為 `visionA-service`(middleware 設定 `req.auth.clientId`)— 此為 handler 行為,不是 schema 改變
|
||
- `source_filename` 欄位確認存在(既有實作可能已有;若無,Backend 補上 — 屬於 Phase B 任務)
|