jim800121chen d8a9517c9d feat(task-scheduler): Phase 0.8b — API key auth + /result endpoint
The auth pillar changes from an OAuth 2.0 resource server to a pre-shared API key
(visionA ↔ converter 1:1 internal trust). Adds a GET /api/v1/jobs/:id/result
streaming endpoint so the visionA backend can relay NEF downloads.

Phase A (auth switch):
- Add apiKeyMiddleware (constant-time compare, tokenFingerprint, 4 audit events)
- Remove the OAuth middleware + JWKS (keep oauthClient for promote → FAA)
- Switch 4 endpoints to requireApiKey
- Add TRUST_PROXY env + Express trust proxy setting (forensic source_ip)

Phase B (/result endpoint):
- Streaming NEF download with 5 min timeout + concurrent cap 10
- Two-tier rate limit (burst 5/10s + sustained 20/min)
- Bandwidth quota (1 GB/hr + 6 GB/24hr) by token_fingerprint
- Range header silently ignored + Accept-Ranges: none
- filename quote-escape + RFC 5987 fallback + sanitize
- 8 /result audit events (complete forensic trail)

Design evolution record: docs/TODO-visionA-integration-v2.md (5/2 OAuth → 5/16 API key
→ 5/16 download via converter; corresponds to visionA repo ADR-015/016)

Tests: 597 → 666 (+69), 29 suites all pass
Security: APPROVE WITH CONDITIONS (single-instance deployment, 6 new env vars, 24hr monitoring)
npm audit: 3 vuln → 0 (transitive AWS SDK xml chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:47:28 +08:00


# API: `POST /api/v1/jobs/:id/promote`
> **Status**: Phase 1 complete. Phase 0.8b keeps this endpoint entirely; only the external auth changes to an API key (converter → FAA still uses OAuth client_credentials).
>
> **See also**: `auth.md` §2 (converter → FAA OAuth client design).
---
## 1. Purpose

PUT the conversion output files (onnx / bie / nef) from the Converter Bucket to the FAA NAS Bucket (long-term storage).

---
## 2. Request
```http
POST /api/v1/jobs/550e8400-.../promote
Authorization: Bearer <CONVERTER_API_KEY>
Content-Type: application/json

{
  "targets": [
    {
      "source": "nef",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.nef"
    },
    {
      "source": "bie",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.bie"
    }
  ]
}
```
### 2.1 Body
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `targets` | array | ✅ | At least 1, at most 10 |
| `targets[].source` | string | ✅ | enum: `onnx`, `bie`, `nef` |
| `targets[].target_object_key` | string | ✅ | Target key on FAA (visionA decides the naming); length ≤ 1024; must not contain `..`, `\`, control characters, `?`, `#`, or `%`, and must not start with `/` |
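
The body rules in the table can be sketched as a small validator. This is a minimal illustration, not the converter's actual code: the function name, the error strings, and the choice to return a list of problems are all assumptions. Only the limits (1 to 10 targets, the `onnx` / `bie` / `nef` enum) come from the table, and the duplicate-`source` rule comes from the `validation_error` row in §4.

```typescript
// Hypothetical names throughout; only the limits and the enum come from this document.
type Source = "onnx" | "bie" | "nef";

interface PromoteTarget {
  source: Source;
  target_object_key: string;
}

const SOURCES: readonly string[] = ["onnx", "bie", "nef"];

// Returns a list of problems; an empty list means the body is acceptable.
function validatePromoteBody(body: unknown): string[] {
  const targets =
    typeof body === "object" && body !== null
      ? (body as { targets?: unknown }).targets
      : undefined;
  if (!Array.isArray(targets)) return ["targets must be an array"];
  if (targets.length < 1 || targets.length > 10) {
    return ["targets must contain between 1 and 10 entries"];
  }
  const errors: string[] = [];
  const seen = new Set<string>();
  targets.forEach((t: unknown, i: number) => {
    const target = (typeof t === "object" && t !== null ? t : {}) as {
      source?: unknown;
      target_object_key?: unknown;
    };
    const source = target.source;
    if (typeof source !== "string" || SOURCES.indexOf(source) === -1) {
      errors.push(`targets[${i}].source must be one of ${SOURCES.join(", ")}`);
    } else if (seen.has(source)) {
      errors.push(`targets[${i}].source duplicates "${source}"`);
    } else {
      seen.add(source);
    }
    if (typeof target.target_object_key !== "string") {
      errors.push(`targets[${i}].target_object_key must be a string`);
    }
  });
  return errors;
}
```

The character-level checks on `target_object_key` are deliberately left out here; they are described in §7.3.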
---
## 3. Response 200
```json
{
  "job_id": "550e8400-...",
  "promoted": [
    {
      "source": "nef",
      "target_object_key": "visionA/models/user-12345/model-1001/v0001/out.nef",
      "size_bytes": 10485760,
      "file_access_agent_etag": "abc123",
      "promoted_at": "2026-05-16T12:30:00Z"
    },
    {
      "source": "bie",
      "target_object_key": "...",
      "size_bytes": 5242880,
      "file_access_agent_etag": "def456",
      "promoted_at": "..."
    }
  ]
}
```
---
## 4. Error Responses
| HTTP | error.code | Scenario |
|------|-----------|----------|
| 400 | `validation_error` | Malformed `targets`, `source` is not a valid stage, duplicate `source` |
| 401 | `invalid_token` | API key missing or mismatched |
| 404 | `job_not_found` | Job does not exist |
| 409 | `job_not_ready_for_promote` | status != COMPLETED (reported in `details.current_status`) |
| 409 | `source_not_available` | The job produced no output for this stage |
| 422 | `invalid_object_key` | `target_object_key` is malformed (includes a reason) |
| 502 | `file_gateway_unavailable` | FAA PUT failed (4xx / 5xx / timeout, after 3 tries) |
| 502 | `storage_unavailable` | MinIO HEAD / GET failed |
| 503 | `auth_service_unavailable` | Could not obtain an FAA token (still failing after a 401 triggered invalidate + retry) |
---
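## 5. Idempotency
When `promote` PUTs the same `target_object_key` twice, the result is the same (FAA overwrites).

**Two-layer idempotency** (Phase 1 implementation retained):
1. **Job-level**: `job.promoted === true` → return 200 with the existing `promoted_object_keys` directly, without calling FAA again
2. **FAA-level**: the FAA PUT itself is idempotent, so retries are safe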
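
The two layers can be illustrated with an in-memory sketch. The `Job` shape, `promoteOnce`, and the synchronous `putToFaa` callback are hypothetical stand-ins (the real job record lives in Redis and the real PUT is asynchronous); the point is only that a second call short-circuits before reaching FAA.

```typescript
// Hypothetical shapes: the real job record and FAA client are not shown in this document.
interface Job {
  promoted: boolean;
  promoted_object_keys: string[];
}

interface PromoteOutcome {
  status: 200;
  promoted_object_keys: string[];
  faaCalls: number; // how many PUTs this call issued
}

// putToFaa stands in for the real FAA PUT; it overwrites, so repeating it is safe.
function promoteOnce(
  job: Job,
  keys: string[],
  putToFaa: (key: string) => void,
): PromoteOutcome {
  // Layer 1 (job-level): already promoted, so answer from the stored record.
  if (job.promoted) {
    return { status: 200, promoted_object_keys: job.promoted_object_keys, faaCalls: 0 };
  }
  // Layer 2 (FAA-level): each PUT is idempotent, so a repeat would only overwrite.
  for (const key of keys) putToFaa(key);
  job.promoted = true;
  job.promoted_object_keys = [...keys];
  return { status: 200, promoted_object_keys: job.promoted_object_keys, faaCalls: keys.length };
}
```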
---
## 6. Implementation flow
```
1. requireApiKey()                                        → 401
2. perClientLimiter                                       → 429
3. validate body                                          → 400 / 422
4. jobService.getJob(id) + client isolation               → 404
5. idempotency check: job.promoted === true               → return 200
6. status === 'COMPLETED' check                           → 409
7. for each target (sequential):
   a. getJobOutputKey(job, target.source)                 → 409 source_not_available
   b. minio.headObject(sourceKey)                         → 502 storage_unavailable
   c. oauthClient.getServiceToken('files:upload.write')   ← OAuth client (retained)
   d. faaClient.putFile(targetKey, streamFactory, ...)    → 502 / 503
   e. collect the promoted result
8. jobService.markPromoted(jobId, ...)                    → log ERROR on failure (the client still gets 200 because the files have actually been moved)
9. return 200 + { job_id, promoted: [...] }
```
---
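## 7. Key decisions (retained from Phase 1)
### 7.1 Promote targets sequentially
**Why sequential:**
- FAA may limit concurrency per client
- On failure it is easy to tell which targets already succeeded
- Streaming large files concurrently would amplify memory / CPU pressure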
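
A sequential loop of this kind might look like the sketch below. `promoteOne` is a hypothetical stand-in for steps 7a to 7e, and stopping at the first failure is an assumption made for the illustration; the document only states that the targets are processed one at a time.

```typescript
interface Target {
  source: string;
  target_object_key: string;
}

interface SequentialResult {
  promoted: string[]; // sources that finished successfully, in order
  failedAt?: string;  // source of the first target that failed, if any
}

// promoteOne is a hypothetical stand-in for steps 7a to 7e of the flow.
async function promoteSequentially(
  targets: Target[],
  promoteOne: (t: Target) => Promise<void>,
): Promise<SequentialResult> {
  const promoted: string[] = [];
  for (const t of targets) {
    try {
      await promoteOne(t); // only one transfer is in flight at a time
      promoted.push(t.source);
    } catch {
      // Assumed behaviour: stop here; `promoted` lists exactly what already succeeded.
      return { promoted, failedAt: t.source };
    }
  }
  return { promoted };
}
```

Because each `await` completes before the next iteration starts, the `promoted` list at the point of failure is an exact record of what reached FAA.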
### 7.2 Stream factory pattern
`faaClient.putFile` accepts `streamFactory: () => Promise<Stream>` and calls `minio.getObjectStream` to obtain a fresh stream only when an attempt starts.

**Why**: an HTTP body cannot be replayed. If attempt #1 fails with a 5xx, attempt #2 must get a new stream.
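
The pattern can be sketched without a real stream library by making the helper generic over the body type. `putWithFreshBody` and `send` are hypothetical names, and the sketch omits backoff and error classification; it only shows that the factory is invoked once per attempt so a consumed body is never reused.

```typescript
// Generic over the body type so the sketch runs without a real stream library.
// `send` is a hypothetical stand-in for the HTTP PUT inside faaClient.putFile.
async function putWithFreshBody<B>(
  bodyFactory: () => Promise<B>,
  send: (body: B) => Promise<void>,
  maxAttempts: number,
): Promise<number> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const body = await bodyFactory(); // a new body for every attempt
    try {
      await send(body);
      return attempt; // number of attempts used
    } catch (err) {
      lastError = err; // the consumed body is discarded, never reused
    }
  }
  throw lastError;
}
```

Passing a stream directly instead of a factory would force the second attempt to send an already drained body, which is the failure this design avoids.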
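### 7.3 `target_object_key` safety checks
Rejected:
- Empty string, or too long (> 1024)
- Leading `/` (avoids FAA interpreting it as an absolute path)
- Contains `..` (path traversal)
- Contains `\` (Windows path / URL injection)
- Contains `\0` / control characters (`\x00-\x1F`, `\x7F`)
- Contains `?` / `#` (URL query / fragment injection)
- Contains `%` (double-encoding attack; prevents `%2E%2E` from decoding to `..`)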
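
The rejection list translates fairly directly into code. In the sketch below the function name and the reason strings are hypothetical, and the length limit is measured with JavaScript's `string.length` (UTF-16 code units), which is an assumption since the document does not say whether the limit is in characters or bytes.

```typescript
// Returns a rejection reason, or null when the key passes every check.
// The function name and reason strings are hypothetical.
function checkTargetObjectKey(key: string): string | null {
  if (key.length === 0) return "empty";
  if (key.length > 1024) return "too_long"; // assumption: measured in UTF-16 code units
  if (key.startsWith("/")) return "leading_slash";
  if (key.indexOf("..") !== -1) return "path_traversal";
  if (key.indexOf("\\") !== -1) return "backslash";
  if (/[\x00-\x1F\x7F]/.test(key)) return "control_character";
  if (/[?#]/.test(key)) return "query_or_fragment";
  if (key.indexOf("%") !== -1) return "percent_encoding";
  return null;
}
```

Rejecting `%` outright is what makes the `..` check sufficient: an encoded traversal such as `%2E%2E` never reaches a decoder.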
### 7.4 FAA error classification
| FAA error | Mapped v1 ApiError |
|-----------|--------------------|
| `FAAUnauthorizedError` (still 401 after retry) | 503 `auth_service_unavailable` |
| `FAAClientError` (4xx other than 401) | 502 `file_gateway_unavailable` (details withheld to avoid leaking FAA internals) |
| `FAAServerError` (5xx) / `FAATimeoutError` | 502 `file_gateway_unavailable` |
| Other | 500 `internal_error` |
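
A mapping function for this table could be sketched as follows. The error classes here are local stand-ins that share only their names with the real FAA client; `FAAErrorBase`, `ApiError`, and `toApiError` are hypothetical. Note that the returned object carries no message from the underlying error, in line with the "details withheld" rule.

```typescript
// Local stand-ins for the FAA client's error classes; only the class names
// and the status/code pairs come from this document.
class FAAErrorBase extends Error {
  constructor(message?: string) {
    super(message);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class FAAUnauthorizedError extends FAAErrorBase {}
class FAAClientError extends FAAErrorBase {}
class FAAServerError extends FAAErrorBase {}
class FAATimeoutError extends FAAErrorBase {}

interface ApiError {
  status: number;
  code: string;
}

function toApiError(err: unknown): ApiError {
  if (err instanceof FAAUnauthorizedError) {
    return { status: 503, code: "auth_service_unavailable" };
  }
  if (
    err instanceof FAAClientError ||
    err instanceof FAAServerError ||
    err instanceof FAATimeoutError
  ) {
    // FAA's own message is intentionally not copied into the response.
    return { status: 502, code: "file_gateway_unavailable" };
  }
  return { status: 500, code: "internal_error" };
}
```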
### 7.5 FAA retry strategy
- 4xx other than 401: no retry (client error; retrying does not help)
- 401: `oauthClient.invalidate(scope)` + retry once; still 401 → 503
- 5xx / timeout / network: retry 2 times with exponential backoff (500 ms, 2000 ms); if all fail → 502
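
The three rules can be expressed as a pure decision function, which keeps the policy testable without timers or network calls. Everything named here (`FailureKind`, `Decision`, `decide`, `refreshToken`) is hypothetical, and the zero delay before the 401 retry is an assumption; the document specifies delays only for the 5xx / timeout / network case.

```typescript
// "transient" covers 5xx, timeout, and network failures.
type FailureKind = "client_4xx" | "unauthorized_401" | "transient";

type Decision =
  | { action: "retry"; delayMs: number; refreshToken: boolean }
  | { action: "fail"; status: 502 | 503 };

const BACKOFF_MS = [500, 2000];

// priorRetries counts retries already spent on this same kind of failure.
function decide(kind: FailureKind, priorRetries: number): Decision {
  switch (kind) {
    case "client_4xx":
      return { action: "fail", status: 502 }; // retrying a client error does not help
    case "unauthorized_401":
      // refreshToken signals the caller to invalidate the cached token first.
      return priorRetries < 1
        ? { action: "retry", delayMs: 0, refreshToken: true } // delay of 0 is an assumption
        : { action: "fail", status: 503 };
    case "transient":
      return priorRetries < BACKOFF_MS.length
        ? { action: "retry", delayMs: BACKOFF_MS[priorRetries], refreshToken: false }
        : { action: "fail", status: 502 };
  }
}
```

A caller would loop on the FAA PUT, pass each failure to `decide`, wait `delayMs` when told to retry, and convert a `fail` decision into the matching ApiError.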
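### 7.6 Handling `markPromoted` failure
When FAA has succeeded (the files are on the NAS) but the Redis `markPromoted` call fails:
- Log ERROR
- Still return 200 to the client (the files have actually been moved)
- The next promote on the same job attempts `markPromoted` again (the FAA PUT is idempotent)
- Side effect: subsequent client calls do not take the idempotent path and PUT once more (harmless)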
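
The "log and still return 200" behaviour might be sketched like this. `finishPromote`, `PromotedItem`, and the injected `markPromoted` / `logError` callbacks are hypothetical, and the sketch is synchronous for brevity even though a Redis write would normally be asynchronous.

```typescript
interface PromotedItem {
  source: string;
  target_object_key: string;
}

// markPromoted and logError are injected so the sketch stays self-contained;
// the real jobService and logger are not shown in this document.
function finishPromote(
  jobId: string,
  promoted: PromotedItem[],
  markPromoted: (jobId: string, items: PromotedItem[]) => void,
  logError: (message: string) => void,
): { status: 200; body: { job_id: string; promoted: PromotedItem[] } } {
  try {
    markPromoted(jobId, promoted);
  } catch (err) {
    // The files are already on the NAS, so record the failure and carry on.
    logError(`markPromoted failed for job ${jobId}: ${String(err)}`);
  }
  return { status: 200, body: { job_id: jobId, promoted } };
}
```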
---
## 8. Curl example
```bash
curl -X POST https://converter.innovedus.com/api/v1/jobs/550e8400-.../promote \
  -H "Authorization: Bearer $CONVERTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "targets": [
      {"source": "nef", "target_object_key": "visionA/models/u-12345/m-1001/v0001/out.nef"}
    ]
  }'
```
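
For callers that prefer TypeScript over curl, the same request can be assembled as a plain object. `buildPromoteRequest` and `PromoteRequest` are hypothetical names; the path, headers, and body shape follow §2.

```typescript
interface PromoteTarget {
  source: "onnx" | "bie" | "nef";
  target_object_key: string;
}

interface PromoteRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so it can be inspected or tested.
function buildPromoteRequest(
  baseUrl: string,
  jobId: string,
  apiKey: string,
  targets: PromoteTarget[],
): PromoteRequest {
  return {
    url: `${baseUrl}/api/v1/jobs/${encodeURIComponent(jobId)}/promote`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ targets }),
  };
}
```

The result can be handed to an HTTP client of choice, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.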