Auth pillar switched from OAuth 2.0 resource server to a pre-shared API key (visionA ↔ converter, 1:1 internal trust). Added a GET /api/v1/jobs/:id/result streaming endpoint so the visionA backend can relay NEF downloads.

Phase A (auth switch):
- Added apiKeyMiddleware (constant-time compare, tokenFingerprint, 4 audit events)
- Removed OAuth middleware + JWKS (oauthClient kept for promote → FAA)
- 4 endpoints now mount requireApiKey
- Added TRUST_PROXY env + Express trust proxy setting (forensic source_ip)

Phase B (/result endpoint):
- Streaming NEF download with 5min timeout + concurrent cap 10
- Two-tier rate limit (burst 5/10s + sustained 20/min)
- Bandwidth quota (1 GB/hr + 6 GB/24hr) by token_fingerprint
- Range header silently ignored + Accept-Ranges: none
- filename quote-escape + RFC 5987 fallback + sanitize
- 8 /result audit events (complete forensics)

Design evolution log: docs/TODO-visionA-integration-v2.md (5/2 OAuth → 5/16 API key → 5/16 download via converter; corresponds to visionA repo ADR-015/016)

Tests: 597 → 666 (+69), 29 suites all pass
Security: APPROVE WITH CONDITIONS (single-instance deployment, 6 new env vars, 24hr monitoring)
npm audit: 3 vuln → 0 (transitive AWS SDK xml chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Performance Design

> **Status**: Phase 1 complete. Phase 0.8b adds the latency budget for the `/result` endpoint.
>
> **Companion docs**: `design-doc.md` §6, `api/api-result.md`.

---

## 1. SLO Summary

| Endpoint | SLI | SLO | Measurement |
|------|-----|-----|---------|
| `/api/v1/*` availability | 2xx+3xx / total requests | ≥ 99.5% (business hours) | Nginx access log + structured app log |
| `GET /api/v1/jobs/:id` p95 | 95th-percentile response time | < 200ms | `duration_ms` in the structured log |
| `GET /api/v1/jobs` p95 | 95th-percentile response time | < 500ms | Same as above |
| `POST /api/v1/jobs` p95 (200MB) | Multipart upload through MinIO write complete | < 5s | Same as above |
| `POST /api/v1/jobs` p95 (500MB) | Same as above | < 12s | Same as above |
| `POST /api/v1/jobs/:id/promote` p95 | 95th-percentile response time | < 3s | Same as above |
| `GET /api/v1/jobs/:id/result` TTFB | Time to first byte | < 500ms | Nginx access log + app log |
| `GET /api/v1/jobs/:id/result` full download (50MB) | Time until the stream completes | < 2s on a 50MB/s link | Measured on the client |
| API key auth failure rate | 401 / total requests | < 0.1% (a single known caller should not fail) | structured log |

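
Most SLIs in the table are p95 values derived from `duration_ms` in the structured log. A minimal sketch of that calculation follows, using the nearest-rank method. The `LogRecord` shape and the `route` field are assumptions made for illustration; only `duration_ms` is named in this document, and the real log pipeline may compute percentiles differently.

```typescript
// Nearest-rank percentile over a list of duration_ms samples.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based rank; multiply before dividing to stay in integer arithmetic.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Hypothetical log record: only duration_ms comes from the document.
interface LogRecord {
  route: string;
  duration_ms: number;
}

// True when the route's p95 is strictly below the SLO threshold.
function meetsP95(records: LogRecord[], route: string, sloMs: number): boolean {
  const durations = records
    .filter((r) => r.route === route)
    .map((r) => r.duration_ms);
  return percentile(durations, 95) < sloMs;
}
```
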
---
## 2. Latency Budget

### 2.1 `POST /api/v1/jobs` (200MB file)

| Stage | Budget | Notes |
|------|------|------|
| Nginx ingress (incl. multipart preamble) | 10ms | |
| API key constant-time compare | 1ms | Compares 64 hex chars |
| Multer receive (memory) | 4000ms | 200MB @ 50MB/s |
| Validation (fields, mimetype, file extension) | 20ms | |
| Upload concurrency semaphore | 0-1000ms | May wait under high concurrency |
| Redis lookup of active_job | 10ms | |
| MinIO PutObject | 1000ms | 200MB @ 200MB/s |
| Redis write of job record + index (Lua) | 20ms | |
| Enqueue to Redis Stream | 10ms | |
| **Total budget (200MB)** | **~5s p95** | |
| **Total budget (500MB)** | **~12s p95** | Transfer stages scale with file size (2.5× the 200MB case) |

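
The totals above can be checked by summing the fixed stages and scaling the two transfer stages by file size. The sketch below does that arithmetic; the function names are illustrative, and the rates (50MB/s receive, 200MB/s MinIO write) are the ones stated in the table.

```typescript
// Time to move sizeMB at rateMBps, in milliseconds.
function transferMs(sizeMB: number, rateMBps: number): number {
  return (sizeMB / rateMBps) * 1000;
}

// Sum of the 2.1 stages for a given file size.
function uploadBudgetMs(sizeMB: number, semaphoreWaitMs = 0): number {
  // ingress + key compare + validation + Redis read + Redis write + enqueue
  const fixed = 10 + 1 + 20 + 10 + 20 + 10;
  const receive = transferMs(sizeMB, 50); // Multer receive @ 50MB/s
  const store = transferMs(sizeMB, 200); // MinIO PutObject @ 200MB/s
  return fixed + semaphoreWaitMs + receive + store;
}

console.log(uploadBudgetMs(200)); // 5071
console.log(uploadBudgetMs(500)); // 12571
```

With no semaphore wait this gives roughly 5.1s and 12.6s, which matches the ~5s and ~12s totals. Any time spent waiting on the upload semaphore (0-1000ms) comes on top.
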
### 2.2 `GET /api/v1/jobs/:id`

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Rate limiter check | 2ms |
| Redis GET job:{id} | 10ms |
| Client isolation check | 1ms |
| Status mapping + serialize | 5ms |
| ETag compute + compare | 2ms |
| **Total budget** | **~30ms p50, < 200ms p95** |

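
Every budget in this section includes a 1ms API key compare, described in 2.1 as a constant-time compare over 64 hex chars. A minimal sketch of one way to do that in Node follows. `apiKeyMatches` is a hypothetical name and not the project's actual middleware; hashing both sides first is an assumption chosen here so that `crypto.timingSafeEqual`, which throws on buffers of different length, always receives equal-length input.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare a presented API key against the expected one in constant time.
function apiKeyMatches(presented: string, expected: string): boolean {
  // SHA-256 digests are always 32 bytes, so the length check in
  // timingSafeEqual can never throw, and key length is not leaked.
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}
```

A plain `===` on strings can return early at the first differing character, which is the timing signal a constant-time compare is meant to remove.
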
### 2.3 `POST /api/v1/jobs/:id/promote` (single target, 50MB NEF)

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Redis GET job | 10ms |
| Idempotency check | 1ms |
| MinIO HEAD object | 50ms |
| OAuth token cache hit | 1ms |
| FAA PUT (50MB @ 100MB/s) | 500ms |
| Redis SET (markPromoted) | 10ms |
| **Total budget** | **~600ms p50, < 3s p95** |

An OAuth token cache miss adds 200-500ms (one token fetch).

### 2.4 `GET /api/v1/jobs/:id/result` (added in Phase 0.8b)

| Stage | Budget |
|------|------|
| Nginx ingress | 5ms |
| API key compare | 1ms |
| Rate limiter check | 2ms |
| Redis GET job | 10ms |
| Status / expires_at check | 1ms |
| MinIO GET stream init (incl. HEAD) | 50ms |
| **TTFB** | **~70ms p50, < 500ms p95** |
| Stream NEF (50MB @ 50MB/s) | 1000ms (measured on the client) |

**TTFB**: the time until the response headers are sent and the first byte reaches the client. This is the key SLO for `/result`. Total download time depends on file size and link bandwidth, so it does not count toward the Scheduler's SLO.

---

## 3. Token Cache Strategy

| Cache | TTL | Eviction condition |
|-------|-----|---------|
| ~~JWKS~~ | ~~10 min~~ | ~~Force refresh on unknown kid~~ (**removed in Phase 0.8b**) |
| FAA service token (used by promote) | `expires_in - 60s` | Force refresh on 401 |

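
The FAA service token row combines two rules: cache for `expires_in - 60s`, and refresh early on a 401. A minimal sketch of that decision logic is shown below as pure functions. The type and function names are hypothetical, and the real `oauthClient` may be structured differently; `access_token` and `expires_in` (in seconds) are the standard OAuth 2.0 token response fields.

```typescript
// Standard OAuth 2.0 token response fields; expires_in is in seconds.
interface TokenResponse {
  access_token: string;
  expires_in: number;
}

// Hypothetical cache entry for illustration.
interface CachedToken {
  token: string;
  refreshAtMs: number;
}

const SAFETY_MARGIN_S = 60; // the "- 60s" from the TTL column

// Build a cache entry that expires 60s before the token itself does.
function toCacheEntry(res: TokenResponse, nowMs: number): CachedToken {
  const ttlS = Math.max(0, res.expires_in - SAFETY_MARGIN_S);
  return { token: res.access_token, refreshAtMs: nowMs + ttlS * 1000 };
}

// Refresh when there is no entry, the TTL has elapsed, or FAA answered 401.
function needsRefresh(
  entry: CachedToken | undefined,
  nowMs: number,
  lastStatus?: number,
): boolean {
  if (entry === undefined) return true;
  if (lastStatus === 401) return true;
  return nowMs >= entry.refreshAtMs;
}
```
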
---
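## 4. Rate Limit Strategy

| Scope | Limit | Rationale |
|------|------|------|
| Global (per IP) | 200 req / 15min | Guards against anonymous traffic / DDoS (existing) |
| Per `client_id` (fixed to `visionA-service` in API key mode) | 300 req / 5min | Guards against runaway polling |

**Phase 0.8b note**: In API key mode there is only one caller and `client_id` is a fixed value, so the per-client rate limit is effectively a global limit. The per-client structure is kept anyway, so traffic is split automatically if more callers are added later.
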
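
To illustrate how a keyed limiter keeps the per-client structure even with a single caller, here is a minimal in-memory sliding-window sketch. This document does not specify the algorithm or the backing store, so both are assumptions, and `SlidingWindowLimiter` is a hypothetical class. Only the limits (300 req / 5min per `client_id`, 200 req / 15min per IP) come from the table.

```typescript
// In-memory sliding-window limiter keyed by client_id or IP.
class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the hit if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const ok = recent.length < this.limit;
    if (ok) recent.push(nowMs);
    this.hits.set(key, recent);
    return ok;
  }
}

// Limits from the table above.
const perClient = new SlidingWindowLimiter(300, 5 * 60 * 1000);
const perIp = new SlidingWindowLimiter(200, 15 * 60 * 1000);
```
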
---
## 5. Streaming Memory Footprint

### 5.1 `/result` stream (added in Phase 0.8b)

Streaming a 50MB NEF:

- MinIO client (aws-sdk): typically a 16KB-64KB internal buffer
- Node HTTP response: default highWaterMark of 16KB
- Scheduler heap growth for the duration of the stream: **< 200KB per request**

→ 1000 concurrent streams are estimated at < 200MB of heap, which is acceptable.

### 5.2 `POST /api/v1/jobs` multipart memoryStorage (existing)

Writing a 500MB multipart upload into memory:

- Single request: 500MB peak heap
- 5 concurrent (`MAX_CONCURRENT_UPLOADS=5`): 2.5GB peak
- Container RAM should be ≥ 4GB

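
The estimates in 5.1 and 5.2 are both simple multiplications. The sketch below reproduces them so the numbers can be re-run for other concurrency levels. The function names are illustrative, and decimal units (1000KB = 1MB) are assumed to match the round figures used above.

```typescript
// Decimal units, matching the document's round figures.
const KB_PER_MB = 1000;

// Upper-bound heap estimate for concurrent /result streams (5.1).
function streamHeapMB(concurrentStreams: number, perRequestKB = 200): number {
  return (concurrentStreams * perRequestKB) / KB_PER_MB;
}

// Peak heap when every upload slot holds a full file in memoryStorage (5.2).
function uploadPeakMB(concurrentUploads: number, maxFileMB = 500): number {
  return concurrentUploads * maxFileMB;
}

console.log(streamHeapMB(1000)); // 200
console.log(uploadPeakMB(5)); // 2500, i.e. 2.5GB
```
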
---
## 6. Observability

See `observability.md` for details.

Every endpoint must log:

- `action` (e.g. `result.success`, `promote.success`)
- `request_id`
- `client_id`, `user_id` (when available)
- `duration_ms` (from handler start to response end)
- On failure: `error_code`, `error_name` (the `error_message` content is not logged, to avoid leaking data)

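
The field list above can be sketched as a small builder that assembles one log record per request. The field names come from the list; `RequestContext`, `buildRequestLog`, and the `"UNKNOWN"` fallback code are hypothetical and used only for illustration. The error message is deliberately never copied into the record.

```typescript
// One structured log record per request, using the fields listed above.
interface RequestLog {
  action: string;
  request_id: string;
  client_id: string;
  user_id?: string;
  duration_ms: number;
  error_code?: string;
  error_name?: string;
}

// Hypothetical per-request context captured at handler start.
interface RequestContext {
  action: string;
  request_id: string;
  client_id: string;
  user_id?: string;
  startMs: number;
}

function buildRequestLog(
  ctx: RequestContext,
  endMs: number,
  err?: { name: string; code?: string; message?: string },
): RequestLog {
  const entry: RequestLog = {
    action: ctx.action,
    request_id: ctx.request_id,
    client_id: ctx.client_id,
    duration_ms: endMs - ctx.startMs, // handler start to response end
  };
  if (ctx.user_id !== undefined) entry.user_id = ctx.user_id;
  if (err) {
    entry.error_code = err.code ?? "UNKNOWN"; // fallback value is an assumption
    entry.error_name = err.name;
    // err.message is intentionally not copied, to avoid leaking data.
  }
  return entry;
}
```
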
---
## 7. Load Test Plan

| Type | Duration | Purpose | Load |
|------|------|------|------|
| Steady-state | 30 min | Baseline validation | Estimated QPS (visionA polling: 5 users × 1 req/2s = 2.5 QPS) |
| Step-load | Increment every 5min | Find the scaling limit | Ramp up gradually to 50 QPS |
| Spike | Instantaneous | Burst traffic | 100 QPS (5x baseline) |
| Soak | 6 hours | Memory leaks | Steady 5 QPS |

`/result`-specific tests:

- Large-file stream (200MB NEF): OOM test
- Many concurrent streams (20 clients downloading at once): confirm the Scheduler stays up
- Slow client (client reads slowly): confirm the stream does not pile up buffers

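
The steady-state figure and the step-load ramp can be worked out with a short sketch. The 5 users, 2s polling interval, 5min step length and 50 QPS ceiling come from the table. The starting QPS and the step size are not stated in this document, so the values passed below are illustrative assumptions only.

```typescript
// Steady-state QPS from polling clients: users / interval.
function pollingQps(users: number, pollIntervalS: number): number {
  return users / pollIntervalS;
}

interface LoadStep {
  atMinute: number;
  qps: number;
}

// Ramp from startQps to maxQps, holding each level for stepMinutes.
// stepQps is a hypothetical parameter; the document only says "ramp up gradually".
function stepLoadPlan(
  startQps: number,
  maxQps: number,
  stepQps: number,
  stepMinutes = 5,
): LoadStep[] {
  const plan: LoadStep[] = [];
  let minute = 0;
  for (let qps = startQps; qps < maxQps; qps += stepQps) {
    plan.push({ atMinute: minute, qps });
    minute += stepMinutes;
  }
  plan.push({ atMinute: minute, qps: maxQps }); // final level is the ceiling
  return plan;
}

console.log(pollingQps(5, 2)); // 2.5
console.log(stepLoadPlan(10, 50, 10).map((s) => s.qps)); // [ 10, 20, 30, 40, 50 ]
```
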