jim800121chen d8a9517c9d feat(task-scheduler): Phase 0.8b — API key auth + /result endpoint
Switch the auth pillar from an OAuth 2.0 resource server to a pre-shared API key
(visionA ↔ converter 1:1 internal trust). Add a GET /api/v1/jobs/:id/result
streaming endpoint so the visionA backend can relay NEF downloads.

Phase A (auth switch):
- Add apiKeyMiddleware (constant-time compare, tokenFingerprint, 4 audit events)
- Remove the OAuth middleware + JWKS (keep oauthClient for promote → FAA)
- Move 4 endpoints to requireApiKey
- Add the TRUST_PROXY env + Express trust proxy setting (forensic source_ip)

Phase B (/result endpoint):
- streaming NEF download with 5min timeout + concurrent cap 10
- Two-tier rate limit (burst 5/10s + sustained 20/min)
- Bandwidth quota (1 GB/hr + 6 GB/24hr) by token_fingerprint
- Range header silently ignored + Accept-Ranges: none
- filename quote-escape + RFC 5987 fallback + sanitize
- 8 /result audit events (complete forensic trail)

Design evolution log: docs/TODO-visionA-integration-v2.md (5/2 OAuth → 5/16 API key
→ 5/16 download via converter; corresponds to visionA repo ADR-015/016)

Tests: 597 → 666 (+69), 29 suites all pass
Security: APPROVE WITH CONDITIONS (single-instance deployment, 6 new env vars, 24hr monitoring)
npm audit: 3 vuln → 0 (transitive AWS SDK xml chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:47:28 +08:00

# Performance Design

> **Status:** Phase 1 complete. Phase 0.8b adds the latency budget for the `/result` endpoint.
>
> **Related:** `design-doc.md` §6, `api/api-result.md`.

---
## 1. SLO Summary

| Endpoint | SLI | SLO | Measurement |
|------|-----|-----|---------|
| `/api/v1/*` availability | 2xx+3xx / total requests | ≥ 99.5% (business hours) | Nginx access log + structured app log |
| `GET /api/v1/jobs/:id` p95 | 95th-percentile response time | < 200ms | structured log `duration_ms` |
| `GET /api/v1/jobs` p95 | 95th-percentile response time | < 500ms | Same as above |
| `POST /api/v1/jobs` p95 (200MB) | Multipart upload through completed MinIO write | < 5s | Same as above |
| `POST /api/v1/jobs` p95 (500MB) | Same as above | < 12s | Same as above |
| `POST /api/v1/jobs/:id/promote` p95 | 95th-percentile response time | < 3s | Same as above |
| `GET /api/v1/jobs/:id/result` TTFB | Time to first byte | < 500ms | Nginx access log + app log |
| `GET /api/v1/jobs/:id/result` full download (50MB) | Time until the stream completes | < 2s @ 50MB/s link | Measured on the client |
| API key auth failure rate | 401 / total requests | < 0.1% (callers should not get this wrong) | structured log |
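
The p95 targets above are derived from the `duration_ms` field in the structured log. A minimal sketch of that computation, assuming a nearest-rank percentile over a batch of samples; `percentile` and `meetsSlo` are illustrative names, not part of the Scheduler codebase:

```typescript
// Nearest-rank percentile over a batch of duration_ms samples.
function percentile(durationsMs: number[], p: number): number {
  if (durationsMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...durationsMs].sort((a, b) => a - b);
  // 1-based rank of the smallest sample that covers p percent of the batch.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Checks a batch against a "< thresholdMs at p95" target,
// e.g. 200 for GET /api/v1/jobs/:id.
function meetsSlo(durationsMs: number[], thresholdMs: number): boolean {
  return percentile(durationsMs, 95) < thresholdMs;
}
```
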
---
## 2. Latency Budgets

### 2.1 `POST /api/v1/jobs` (200MB file)

| Stage | Budget | Notes |
|------|------|------|
| Nginx ingress (before multipart) | 10ms | |
| API key constant-time compare | 1ms | Compares 64 hex chars |
| Multer receive (memory) | 4000ms | 200MB @ 50MB/s |
| Validation (fields, mimetype, file extension) | 20ms | |
| Upload concurrency semaphore | 0-1000ms | May wait under high concurrency |
| Redis active_job | 10ms | |
| MinIO PutObject | 1000ms | 200MB @ 200MB/s |
| Redis job record + index (Lua) | 20ms | |
| Enqueue Redis Stream | 10ms | |
| **Total budget (200MB)** | **~5s p95** | |
| **Total budget (500MB)** | **~12s p95** | Mostly multipart time |
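
The two totals can be reproduced by summing the stages. A rough sketch, assuming the Multer and MinIO stages scale linearly with file size at the rates in the table (50MB/s and 200MB/s) while the other stages stay fixed; `uploadBudgetMs` is an illustrative helper, not Scheduler code:

```typescript
// Fixed-cost stages from the table: ingress, key compare, validation,
// Redis active_job, Redis job record + index, enqueue.
const FIXED_STAGES_MS = 10 + 1 + 20 + 10 + 20 + 10;

// Throughput assumptions taken from the "Notes" column.
const MULTER_MB_PER_S = 50;
const MINIO_MB_PER_S = 200;

// Sums the budget for one upload; semaphoreWaitMs is the 0-1000ms stage.
function uploadBudgetMs(sizeMb: number, semaphoreWaitMs = 0): number {
  const multerMs = (sizeMb / MULTER_MB_PER_S) * 1000;
  const minioMs = (sizeMb / MINIO_MB_PER_S) * 1000;
  return FIXED_STAGES_MS + multerMs + minioMs + semaphoreWaitMs;
}
```

Under these assumptions `uploadBudgetMs(200)` and `uploadBudgetMs(500)` come to 5071ms and 12571ms, so the ~5s and ~12s totals read as rounded figures that exclude any semaphore wait.
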
### 2.2 `GET /api/v1/jobs/:id`

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Rate limiter check | 2ms |
| Redis GET job:{id} | 10ms |
| Client isolation check | 1ms |
| Status mapping + serialize | 5ms |
| ETag computation + compare | 2ms |
| **Total budget** | **~30ms p50, < 200ms p95** |
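
Each budget includes an API key compare stage of about 1ms. One way to keep that compare constant-time in Node.js is `crypto.timingSafeEqual`. The sketch below hashes both sides first so the buffers always have equal length. `apiKeyMatches`, `tokenFingerprint`, and the 12-character fingerprint length are assumptions for illustration, not the actual `apiKeyMiddleware` code:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Constant-time comparison of a presented key against the expected key.
// timingSafeEqual throws on unequal lengths, so both sides are hashed to
// fixed-size SHA-256 digests before comparing.
function apiKeyMatches(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}

// Hypothetical fingerprint for logs and quota keys: a short hash prefix,
// so the raw key itself never appears in a log line.
function tokenFingerprint(key: string): string {
  return createHash("sha256").update(key, "utf8").digest("hex").slice(0, 12);
}
```

Hashing adds a small fixed cost per request but avoids leaking the key length through an early length check.
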
### 2.3 `POST /api/v1/jobs/:id/promote` (single target, 50MB NEF)

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Redis GET job | 10ms |
| Idempotency check | 1ms |
| MinIO HEAD object | 50ms |
| OAuth token cache hit | 1ms |
| FAA PUT (50MB @ 100MB/s) | 500ms |
| Redis SET (markPromoted) | 10ms |
| **Total budget** | **~600ms p50, < 3s p95** |

An OAuth token cache miss adds 200-500ms (one token fetch).
### 2.4 `GET /api/v1/jobs/:id/result` (added in Phase 0.8b)

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Rate limiter check | 2ms |
| Redis GET job | 10ms |
| Status / expires_at check | 1ms |
| MinIO GET stream init (with HEAD) | 50ms |
| **TTFB** | **~70ms p50, < 500ms p95** |
| Stream NEF (50MB @ 50MB/s) | 1000ms (measured on the client) |

**TTFB** (headers sent + first byte reaching the client) is the key SLO for `/result`. Total download time depends on file size and link bandwidth and does not count toward the Scheduler SLO.

---
## 3. Token Cache Strategy

| Cache | TTL | Invalidation condition |
|-------|-----|---------|
| ~~JWKS~~ | ~~10 min~~ | ~~Force refresh on an unknown kid~~ **Removed in Phase 0.8b** |
| FAA service token (promote) | `expires_in - 60s` | Force refresh on 401 |
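
The FAA service token row can be read as: cache the token until 60 seconds before it expires, and drop it early when FAA answers 401. A minimal sketch under those assumptions; `toCacheEntry`, `isFresh`, `getServiceToken`, and `invalidateServiceToken` are hypothetical names, and `fetchToken` stands in for the real OAuth client call:

```typescript
interface CachedToken {
  accessToken: string;
  validUntilMs: number;
}

const SAFETY_MARGIN_SEC = 60;

// TTL is expires_in minus the safety margin, never negative.
function toCacheEntry(accessToken: string, expiresInSec: number, nowMs: number): CachedToken {
  const ttlSec = Math.max(0, expiresInSec - SAFETY_MARGIN_SEC);
  return { accessToken, validUntilMs: nowMs + ttlSec * 1000 };
}

function isFresh(entry: CachedToken | null, nowMs: number): entry is CachedToken {
  return entry !== null && nowMs < entry.validUntilMs;
}

let cached: CachedToken | null = null;

// Cache hit returns immediately; a miss pays for one token fetch.
async function getServiceToken(
  fetchToken: () => Promise<{ accessToken: string; expiresInSec: number }>,
  nowMs: number = Date.now(),
): Promise<string> {
  if (isFresh(cached, nowMs)) return cached.accessToken;
  const res = await fetchToken();
  cached = toCacheEntry(res.accessToken, res.expiresInSec, nowMs);
  return cached.accessToken;
}

// Call this when FAA responds 401 so the next request refetches.
function invalidateServiceToken(): void {
  cached = null;
}
```
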
---
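## 4. Rate Limit Strategy

| Scope | Limit | Rationale |
|------|------|------|
| Global (per IP) | 200 req / 15min | Guards against anonymous traffic / DDoS (existing) |
| Per `client_id` (fixed to `visionA-service` in API key mode) | 300 req / 5min | Guards against runaway polling |

**Phase 0.8b note:** in API key mode there is only one caller (`client_id` is a fixed value), so the per-client rate limit is effectively a global limit. The per-client structure is kept so that traffic is split automatically if multiple callers appear later.
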
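
The two scopes in the table can be modelled as independent counters keyed by IP and by `client_id`. A minimal in-memory sketch, assuming a fixed-window algorithm (the document does not say which algorithm the Scheduler uses) and a single instance; `FixedWindowLimiter` and `allowRequest` are illustrative names:

```typescript
interface WindowState {
  startedAtMs: number;
  count: number;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and counts the request if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const w = this.windows.get(key);
    if (w === undefined || nowMs - w.startedAtMs >= this.windowMs) {
      this.windows.set(key, { startedAtMs: nowMs, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}

const perIp = new FixedWindowLimiter(200, 15 * 60 * 1000);
const perClient = new FixedWindowLimiter(300, 5 * 60 * 1000);

// Both limiters are evaluated so each counter sees every request.
function allowRequest(ip: string, clientId: string, nowMs: number): boolean {
  const ipOk = perIp.allow(ip, nowMs);
  const clientOk = perClient.allow(clientId, nowMs);
  return ipOk && clientOk;
}
```
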
---
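## 5. Streaming Memory Footprint

### 5.1 `/result` stream (added in Phase 0.8b)

For a 50MB NEF stream:

- MinIO client (aws-sdk): typically a 16KB-64KB internal buffer
- Node HTTP response: `highWaterMark` defaults to 16KB
- Scheduler heap growth for the duration of the stream: **< 200KB per request**

At 1000 concurrent streams this comes to an estimated < 200MB of heap, which is acceptable.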
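
The small per-request footprint relies on backpressure: the MinIO read side is paused whenever the HTTP response buffer is full. A minimal sketch using Node's `pipeline` from `node:stream/promises`, assuming the object body is available as a `Readable`; `streamResult` and `estimateStreamHeapMb` are hypothetical helpers:

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Pipes the object body to the response. pipeline() applies backpressure
// and destroys both streams if either side fails.
async function streamResult(source: Readable, destination: Writable): Promise<void> {
  await pipeline(source, destination);
}

// Upper-bound heap estimate in MB (1MB = 1000KB) for N concurrent streams.
function estimateStreamHeapMb(concurrentStreams: number, perRequestKb: number): number {
  return (concurrentStreams * perRequestKb) / 1000;
}
```

`estimateStreamHeapMb(1000, 200)` gives the 200MB upper bound quoted above.
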
### 5.2 `POST /api/v1/jobs` multipart memoryStorage (existing)

A 500MB multipart upload is written into memory:

- Single request: 500MB peak heap
- 5 concurrent (`MAX_CONCURRENT_UPLOADS=5`): 2.5GB peak
- Container RAM: 4GB
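
The 2.5GB peak follows from capping concurrent uploads at `MAX_CONCURRENT_UPLOADS`. A minimal sketch of a counting semaphore that enforces such a cap and makes extra requests wait (the 0-1000ms semaphore stage in §2.1); `UploadSemaphore` and `withUploadSlot` are hypothetical names, not the Scheduler's implementation:

```typescript
const MAX_CONCURRENT_UPLOADS = 5;
const MAX_UPLOAD_MB = 500;
// Worst case: every slot holds a full-size upload in memory (2500MB).
const PEAK_UPLOAD_HEAP_MB = MAX_CONCURRENT_UPLOADS * MAX_UPLOAD_MB;

class UploadSemaphore {
  private active = 0;
  private readonly waiters: Array<() => void> = [];
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Resolves immediately if a slot is free, otherwise queues the caller.
  acquire(): Promise<void> {
    if (this.active < this.max) {
      this.active += 1;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Hands the slot to the oldest waiter, or frees it if nobody is waiting.
  release(): void {
    const next = this.waiters.shift();
    if (next !== undefined) {
      next();
      return;
    }
    if (this.active > 0) this.active -= 1;
  }

  get activeCount(): number {
    return this.active;
  }

  get waitingCount(): number {
    return this.waiters.length;
  }
}

// Runs one upload inside a slot and always releases it afterwards.
async function withUploadSlot<T>(sem: UploadSemaphore, task: () => Promise<T>): Promise<T> {
  await sem.acquire();
  try {
    return await task();
  } finally {
    sem.release();
  }
}
```
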
---
## 6. Observability

See `observability.md` for details.

Every endpoint logs:

- `action` (e.g. `result.success`, `promote.success`)
- `request_id`
- `client_id`, plus `user_id` when available
- `duration_ms` (handler start to response end)
- On failure: `error_code` and `error_name`; the `error_message` content is not logged, to avoid leaks
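
The field list above maps naturally onto one structured log object per request. A minimal sketch, assuming a hypothetical `RequestContext` and `AppError` shape; only the logged field names come from the list, everything else is illustrative:

```typescript
interface RequestLogEntry {
  action: string;
  request_id: string;
  client_id: string;
  user_id?: string;
  duration_ms: number;
  error_code?: string;
  error_name?: string;
}

// Hypothetical per-request context captured at handler start.
interface RequestContext {
  action: string;
  requestId: string;
  clientId: string;
  userId?: string;
  startedAtMs: number;
}

// Hypothetical error shape; the real error classes may differ.
interface AppError {
  code: string;
  name: string;
  message: string;
}

function buildLogEntry(ctx: RequestContext, endedAtMs: number, error?: AppError): RequestLogEntry {
  const entry: RequestLogEntry = {
    action: ctx.action,
    request_id: ctx.requestId,
    client_id: ctx.clientId,
    duration_ms: endedAtMs - ctx.startedAtMs,
  };
  if (ctx.userId !== undefined) entry.user_id = ctx.userId;
  if (error !== undefined) {
    entry.error_code = error.code;
    entry.error_name = error.name;
    // error.message is deliberately dropped so its content never reaches the log.
  }
  return entry;
}
```
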
---
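## 7. Load Test Plan

| Type | Duration | Purpose | Load |
|------|------|------|------|
| Steady-state | 30 min | Baseline validation | Estimated QPS (visionA polling: 5 users × 1 req/2s = 2.5 QPS) |
| Step-load | 5min increments | Find the scaling limit | Ramp up gradually to 50 QPS |
| Spike | Instantaneous | Burst traffic | 100 QPS (5x baseline) |
| Soak | 6 hours | Memory leaks | Steady 5 QPS |

Extra tests for `/result`:

- Large-file stream (200MB NEF): OOM test
- Many concurrent streams (20 clients downloading at once): confirm the Scheduler stays up
- Slow client (client reads slowly): confirm the stream does not pile up buffers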