jim800121chen d8a9517c9d feat(task-scheduler): Phase 0.8b — API key auth + /result endpoint
Switch the auth pillar from an OAuth 2.0 resource server to a pre-shared API key
(visionA ↔ converter 1:1 internal trust). Add a GET /api/v1/jobs/:id/result
streaming endpoint so the visionA backend can relay NEF downloads.

Phase A (auth switch):
- Add apiKeyMiddleware (constant-time compare, tokenFingerprint, 4 audit events)
- Remove OAuth middleware + JWKS (keep oauthClient for promote → FAA)
- Switch 4 endpoints to requireApiKey
- Add TRUST_PROXY env + Express trust proxy setting (forensic source_ip)

Phase B (/result endpoint):
- Streaming NEF download with 5min timeout + concurrent cap 10
- Two-tier rate limit (burst 5/10s + sustained 20/min)
- Bandwidth quota (1 GB/hr + 6 GB/24hr) by token_fingerprint
- Range header silently ignored + Accept-Ranges: none
- Filename quote-escape + RFC 5987 fallback + sanitize
- 8 /result audit events (complete forensic coverage)

Design evolution record: docs/TODO-visionA-integration-v2.md (5/2 OAuth → 5/16 API key
→ 5/16 download via converter; corresponds to visionA repo ADR-015/016)

Tests: 597 → 666 (+69), 29 suites all pass
Security: APPROVE WITH CONDITIONS (single-instance deployment, 6 new env vars, 24hr monitoring)
npm audit: 3 vuln → 0 (transitive AWS SDK xml chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:47:28 +08:00


API: POST /api/v1/jobs/:id/promote

Status: Phase 1 complete. Phase 0.8b keeps this endpoint fully intact; only the external-facing auth switches to an API key (converter → FAA still uses OAuth client_credentials).

Companion: auth.md §2 (converter → FAA OAuth client design).


1. Purpose

PUT the conversion result files (onnx / bie / nef) from the Converter Bucket to the FAA NAS Bucket for long-term storage.


2. Request

POST /api/v1/jobs/550e8400-.../promote
Authorization: Bearer <CONVERTER_API_KEY>
Content-Type: application/json

{
  "targets": [
    {
      "source": "nef",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.nef"
    },
    {
      "source": "bie",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.bie"
    }
  ]
}

2.1 Body

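| Field | Type | Required | Description |
|---|---|---|---|
| targets | array | | At least 1 item, at most 10 |
| targets[].source | string | | enum: onnx, bie, nef |
| targets[].target_object_key | string | | Target key in FAA (visionA decides the naming); length ≤ 1024; must not contain .., \, control characters, ?, #, or %, and must not start with / |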
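
The body constraints above can be checked along the following lines. This is a minimal sketch, not the converter's actual validator: PromoteTarget, validateTargets, and the reason strings are hypothetical names introduced for illustration. Only the 1 to 10 item limit, the source enum, and the duplicate-source rejection (listed under 400 validation_error) come from this document. The target_object_key format rules are a separate check (422) and are not covered here.

```typescript
// Hypothetical names; only the limits and the enum come from this document.
const SOURCES = ["onnx", "bie", "nef"] as const;
type Source = (typeof SOURCES)[number];

interface PromoteTarget {
  source: Source;
  target_object_key: string;
}

type ValidationResult =
  | { ok: true; targets: PromoteTarget[] }
  | { ok: false; reason: string };

function isSource(value: unknown): value is Source {
  return typeof value === "string" && (SOURCES as readonly string[]).indexOf(value) !== -1;
}

function validateTargets(body: unknown): ValidationResult {
  const targets = (body as { targets?: unknown } | null)?.targets;
  if (!Array.isArray(targets) || targets.length < 1 || targets.length > 10) {
    return { ok: false, reason: "targets must be an array of 1 to 10 items" };
  }
  const seen: Source[] = [];
  const result: PromoteTarget[] = [];
  for (const item of targets) {
    const source: unknown = item?.source;
    const key: unknown = item?.target_object_key;
    if (!isSource(source)) {
      return { ok: false, reason: "source must be onnx, bie, or nef" };
    }
    if (typeof key !== "string") {
      return { ok: false, reason: "target_object_key must be a string" };
    }
    // Each source may appear at most once per request.
    if (seen.indexOf(source) !== -1) {
      return { ok: false, reason: "duplicate source: " + source };
    }
    seen.push(source);
    result.push({ source, target_object_key: key });
  }
  return { ok: true, targets: result };
}
```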

3. Response 200

{
  "job_id": "550e8400-...",
  "promoted": [
    {
      "source": "nef",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.nef",
      "size_bytes": 10485760,
      "file_access_agent_etag": "abc123",
      "promoted_at": "2026-05-16T12:30:00Z"
    },
    {
      "source": "bie",
      "target_object_key": "...",
      "size_bytes": 5242880,
      "file_access_agent_etag": "def456",
      "promoted_at": "..."
    }
  ]
}

4. Error Responses

| HTTP | error.code | Scenario |
|---|---|---|
| 400 | validation_error | targets malformed, source not a valid stage, duplicate source |
| 401 | invalid_token | API key missing or mismatched |
| 404 | job_not_found | Job does not exist |
| 409 | job_not_ready_for_promote | status != COMPLETED (details.current_status) |
| 409 | source_not_available | Job did not produce a result for this stage |
| 422 | invalid_object_key | target_object_key format is invalid (includes reason) |
| 502 | file_gateway_unavailable | FAA PUT failed (4xx / 5xx / timeout, retried 3 times) |
| 502 | storage_unavailable | MinIO HEAD / GET failed |
| 503 | auth_service_unavailable | Failed to obtain an FAA token (on 401: invalidated + retried, still failing) |

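5. Idempotency

Promoting to the same target_object_key twice yields the same result (FAA overwrites).

Two-layer idempotency (Phase 1 implementation retained):

  1. Job-level: job.promoted === true → return 200 immediately with the existing promoted_object_keys, without calling FAA again
  2. FAA-level: FAA PUT is itself idempotent, so retries are safe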

6. Implementation flow

1. requireApiKey() → 401
2. perClientLimiter → 429
3. validate body → 400 / 422
4. jobService.getJob(id) + client isolation → 404
5. idempotency check: job.promoted === true → return 200
6. status === 'COMPLETED' check → 409
7. for each target (sequential):
   a. getJobOutputKey(job, target.source) → 409 source_not_available
   b. minio.headObject(sourceKey) → 502 storage_unavailable
   c. oauthClient.getServiceToken('files:upload.write')   ← OAuth client (retained)
   d. faaClient.putFile(targetKey, streamFactory, ...) → 502 / 503
   e. collect the promoted result
8. jobService.markPromoted(jobId, ...) → log ERROR on failure (the client still gets 200, because the files have actually been moved)
9. return 200 + { job_id, promoted: [...] }
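
Steps 4 to 9 above can be sketched as a single function with injected dependencies. This is an illustration under stated assumptions, not the actual handler: Deps, putToFaa (which collapses steps 7b to 7d), the Job shape, and the { error: { code } } body shape are all hypothetical. The status codes, error codes, and ordering follow the flow above. Errors thrown by putToFaa simply propagate here; their mapping to 502 / 503 is a separate concern.

```typescript
// Every name below is a hypothetical stand-in for the real services.
type Source = "onnx" | "bie" | "nef";
interface Target { source: Source; target_object_key: string }
interface Job {
  id: string;
  status: string;
  promoted: boolean;
  outputs: Partial<Record<Source, string>>; // source -> Converter Bucket key
  promoted_object_keys?: string[];
}
interface Deps {
  getJob(id: string): Promise<Job | null>;
  putToFaa(sourceKey: string, targetKey: string): Promise<void>; // steps 7b-7d
  markPromoted(id: string, keys: string[]): Promise<void>;
  logError(message: string): void;
}
interface HttpResult { status: number; body: unknown }

async function promote(deps: Deps, jobId: string, targets: Target[]): Promise<HttpResult> {
  const job = await deps.getJob(jobId); // step 4
  if (!job) return { status: 404, body: { error: { code: "job_not_found" } } };

  if (job.promoted) { // step 5: job-level idempotency, FAA is not called again
    return { status: 200, body: { job_id: job.id, promoted_object_keys: job.promoted_object_keys ?? [] } };
  }
  if (job.status !== "COMPLETED") { // step 6
    return {
      status: 409,
      body: { error: { code: "job_not_ready_for_promote", details: { current_status: job.status } } },
    };
  }

  const promoted: Target[] = [];
  for (const target of targets) { // step 7: one target at a time
    const sourceKey = job.outputs[target.source];
    if (!sourceKey) return { status: 409, body: { error: { code: "source_not_available" } } };
    await deps.putToFaa(sourceKey, target.target_object_key);
    promoted.push(target);
  }

  try { // step 8: a bookkeeping failure must not fail the request
    await deps.markPromoted(job.id, promoted.map((t) => t.target_object_key));
  } catch (err) {
    deps.logError("markPromoted failed for job " + job.id + ": " + String(err));
  }
  return { status: 200, body: { job_id: job.id, promoted } }; // step 9
}
```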

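7. Key decisions (retained from Phase 1)

7.1 Promote targets sequentially

Why sequential:

  • FAA may limit concurrency for a single client
  • On failure, it is easy to tell which targets already succeeded
  • Concurrent streaming of large files would amplify memory / CPU pressure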

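7.2 Stream factory pattern

faaClient.putFile accepts streamFactory: () => Promise<Stream> and invokes it once per attempt, so each attempt calls minio.getObjectStream to obtain a fresh stream.

Why: an HTTP body cannot be replayed. If attempt #1 fails with a 5xx, attempt #2 must get a fresh stream.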
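
The pattern can be illustrated with a small generic helper. This is a sketch only: putWithFreshStream and its send parameter are hypothetical, and the body type is left generic so the example runs without a real stream. In the converter, the factory would be something like () => minio.getObjectStream(sourceKey), as described above.

```typescript
// Hypothetical helper; S stands in for a readable stream type.
type StreamFactory<S> = () => Promise<S>;

async function putWithFreshStream<S>(
  streamFactory: StreamFactory<S>,
  send: (body: S) => Promise<void>,
  maxAttempts: number,
): Promise<number> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A new body is created for every attempt.
    const body = await streamFactory();
    try {
      await send(body);
      return attempt; // number of attempts it took
    } catch (err) {
      // The failed body may be partly consumed, so it is never reused.
      lastError = err;
    }
  }
  throw lastError;
}
```

Passing a factory rather than a stream keeps the retry loop unaware of where the bytes come from, while ensuring that no attempt starts with a half-read body.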

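7.3 target_object_key safety checks

Rejected:

  • Empty string, or too long (> 1024)
  • Leading / (so FAA cannot interpret it as an absolute path)
  • .. (path traversal)
  • \ (Windows path / URL injection)
  • \0 / control characters (\x00-\x1F, \x7F)
  • ? / # (URL query / fragment injection)
  • % (double-encoding attacks, so that %2E%2E cannot decode to ..)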
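
The rejection list above maps to a short check function. A minimal sketch: checkTargetObjectKey and the reason strings are hypothetical, and the length limit is assumed to count string length (UTF-16 code units), which this document does not specify.

```typescript
// Returns a rejection reason, or null when the key passes every check.
// Function name and reason strings are hypothetical.
function checkTargetObjectKey(key: string): string | null {
  if (key.length === 0) return "empty";
  if (key.length > 1024) return "too_long"; // assumption: measured in string length
  if (key.charAt(0) === "/") return "leading_slash";
  if (key.indexOf("..") !== -1) return "path_traversal";
  if (key.indexOf("\\") !== -1) return "backslash";
  if (/[\x00-\x1F\x7F]/.test(key)) return "control_character";
  if (/[?#]/.test(key)) return "query_or_fragment";
  // Rejecting % outright means an encoded ".." (%2E%2E) can never reach FAA.
  if (key.indexOf("%") !== -1) return "percent_encoding";
  return null;
}
```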

7.4 FAA error classification

| FAA error | Mapped v1 ApiError |
|---|---|
| FAAUnauthorizedError (still 401 after retry) | 503 auth_service_unavailable |
| FAAClientError (4xx other than 401) | 502 file_gateway_unavailable (details withheld to avoid leaking FAA internals) |
| FAAServerError (5xx) / FAATimeoutError | 502 file_gateway_unavailable |
| Other | 500 internal_error |
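
The mapping in the table can be sketched as follows. The four error class names come from this document, but their definitions here are stand-ins, and toApiError and ApiErrorShape are hypothetical. The sketch dispatches on the error's name property; an instanceof check would serve equally well where class identity is reliable.

```typescript
// Stand-in definitions; only the class names come from this document.
class FAAUnauthorizedError extends Error {
  constructor(message?: string) { super(message); this.name = "FAAUnauthorizedError"; }
}
class FAAClientError extends Error {
  constructor(message?: string) { super(message); this.name = "FAAClientError"; }
}
class FAAServerError extends Error {
  constructor(message?: string) { super(message); this.name = "FAAServerError"; }
}
class FAATimeoutError extends Error {
  constructor(message?: string) { super(message); this.name = "FAATimeoutError"; }
}

interface ApiErrorShape { status: number; code: string }

function toApiError(err: unknown): ApiErrorShape {
  const name = err instanceof Error ? err.name : "";
  switch (name) {
    case "FAAUnauthorizedError": // still 401 after the token was refreshed
      return { status: 503, code: "auth_service_unavailable" };
    case "FAAClientError": // FAA's own message is deliberately not forwarded
    case "FAAServerError":
    case "FAATimeoutError":
      return { status: 502, code: "file_gateway_unavailable" };
    default:
      return { status: 500, code: "internal_error" };
  }
}
```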

7.5 FAA retry strategy

  • 4xx other than 401: no retry (client error, retrying would not help)
  • 401: oauthClient.invalidate(scope) + retry once; still 401 → 503
  • 5xx / timeout / network: retry 2 times (exponential backoff 500ms / 2000ms); all failed → 502
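
The three rules above can be sketched as one retry loop. Everything here is illustrative: RetryDeps, Failure, and putWithRetry are hypothetical, invalidateToken stands in for oauthClient.invalidate(scope), and sleep is injected so the backoff can be observed without waiting. The sketch also assumes the 401 retry and the transient retries are counted independently, which this document does not state.

```typescript
// Hypothetical failure shape: status is the HTTP status, or undefined for
// timeout / network errors.
interface Failure { status?: number }

interface RetryDeps {
  attempt(): Promise<void>;          // one PUT attempt; rejects with a Failure
  invalidateToken(): void;           // stands in for oauthClient.invalidate(scope)
  sleep(ms: number): Promise<void>;
}

const BACKOFF_MS = [500, 2000];      // waits before transient retry 1 and 2

async function putWithRetry(deps: RetryDeps): Promise<void> {
  let authRetried = false;
  let transientRetries = 0;
  for (;;) {
    try {
      await deps.attempt();
      return;
    } catch (err) {
      const status = (err as Failure | undefined)?.status;
      if (status === 401) {
        if (authRetried) throw err;  // still 401: caller maps this to 503
        authRetried = true;
        deps.invalidateToken();      // force a fresh token on the next attempt
      } else if (status !== undefined && status >= 400 && status < 500) {
        throw err;                   // other 4xx: never retried
      } else {
        const delay = BACKOFF_MS[transientRetries];
        if (delay === undefined) throw err; // retries exhausted: caller maps to 502
        await deps.sleep(delay);
        transientRetries++;
      }
    }
  }
}
```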

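7.6 Handling markPromoted failure

FAA succeeded (the file is on the NAS) but the Redis markPromoted call failed:

  • Log ERROR
  • Still return 200 to the client (the files have actually been moved)
  • The next promote of the same job will attempt markPromoted again (FAA PUT is idempotent)
  • Side effect: the client's subsequent call does not take the idempotent path and performs the PUT again (harmless)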

8. curl example

curl -X POST https://converter.innovedus.com/api/v1/jobs/550e8400-.../promote \
  -H "Authorization: Bearer $CONVERTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "targets": [
      {"source": "nef", "target_object_key": "visionA/models/u-12345/m-1001/v0001/out.nef"}
    ]
  }'