Auth pillar switched from an OAuth 2.0 resource server to a pre-shared API key (visionA ↔ converter, 1:1 internal trust). Added a streaming endpoint, GET /api/v1/jobs/:id/result, so the visionA backend can relay NEF downloads.

Phase A (auth switch):
- Added apiKeyMiddleware (constant-time compare, tokenFingerprint, 4 audit events)
- Removed the OAuth middleware + JWKS (oauthClient kept for promote → FAA)
- 4 endpoints now mount requireApiKey
- Added the TRUST_PROXY env + Express trust proxy setting (forensic source_ip)

Phase B (/result endpoint):
- Streaming NEF download with 5min timeout + concurrent cap 10
- Two-tier rate limit (burst 5/10s + sustained 20/min)
- Bandwidth quota (1 GB/hr + 6 GB/24hr) by token_fingerprint
- Range header silently ignored + Accept-Ranges: none
- filename quote-escape + RFC 5987 fallback + sanitize
- 8 /result audit events (complete forensic trail)

Design history: docs/TODO-visionA-integration-v2.md (5/2 OAuth → 5/16 API key → 5/16 download via converter; corresponds to visionA repo ADR-015/016)

Tests: 597 → 666 (+69), 29 suites all pass
Security: APPROVE WITH CONDITIONS (single-instance deployment, 6 new env vars, 24hr monitoring)
npm audit: 3 vuln → 0 (transitive AWS SDK xml chain)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Infra Design

> **Status**: Phase 1 complete. Phase 0.8b changes env variables only; the Nginx / docker-compose structure is unchanged.
>
> **See also**: `design-doc.md` §7, `auth.md` §4 (CONVERTER_API_KEY management).

---

## 1. Nginx dual-vhost routing

The Phase 1 design is kept (**Phase 0.8b does not touch it**):

- **public vhost** (443, public internet): proxies only `/api/v1/*` + `/health`
- **internal vhost** (internal IP, port 80): proxies `/jobs/*` + `/queues/stats` + the Web UI

```
┌────────────────────────────────────────────────────────────────┐
│                     Nginx (single process)                     │
│                                                                │
│ ┌─────────────────────────┐  ┌───────────────────────────────┐ │
│ │ server {                │  │ server {                      │ │
│ │   listen 443 ssl;       │  │   listen 10.0.0.1:80;         │ │
│ │   server_name           │  │   server_name                 │ │
│ │     converter....com;   │  │     converter-internal...;    │ │
│ │                         │  │                               │ │
│ │   location /api/v1/ {}  │  │   location /jobs {}           │ │
│ │   location = /health {} │  │   location /queues/stats {}   │ │
│ │   location / {          │  │   location / {                │ │
│ │     return 404;         │  │     proxy_pass web:3000;      │ │
│ │   }                     │  │   }                           │ │
│ │ }                       │  │ }                             │ │
│ │ (public vhost)          │  │ (internal vhost, internal IP) │ │
│ └────────────┬────────────┘  └───────────────┬───────────────┘ │
└──────────────┼───────────────────────────────┼─────────────────┘
               │                               │
               ▼                               ▼
   ┌──────────────────────────────────────────────────────────┐
   │                  Task Scheduler (:4000)                  │
   │ - /api/v1/*  (API key protected; public vhost only)      │
   │ - /jobs/*    (no auth; internal vhost only)              │
   │ - /jobs/*/events (SSE)                                   │
   │ - /health, /queues/stats                                 │
   └──────────────────────────────────────────────────────────┘
```

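One way to reason about the split is to model which backend each vhost would hand a path to. The sketch below is illustrative only: `route_public` and `route_internal` are hypothetical helper names, and the `case` patterns are a simplified reading of the `location` rules in the diagram (query strings and regex locations are ignored). It could serve as the expected-result table for a smoke test.

```bash
# Hypothetical helpers mirroring the location rules in the diagram.

# Public vhost: only the /api/v1/ prefix and exactly /health are proxied.
route_public() {
  case "$1" in
    /api/v1/*) echo "scheduler" ;;
    /health)   echo "scheduler" ;;
    *)         echo "404" ;;
  esac
}

# Internal vhost: the /jobs and /queues/stats prefixes go to the scheduler,
# everything else falls through to the Web UI.
route_internal() {
  case "$1" in
    /jobs*|/queues/stats*) echo "scheduler" ;;
    *)                     echo "web" ;;
  esac
}
```

For example, `route_public /jobs/123` prints `404`, which matches the intent that unauthenticated `/jobs/*` routes are never reachable from the public internet.
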

---

## 2. Full Nginx configuration (unchanged)

```nginx
# /etc/nginx/conf.d/converter.conf

upstream scheduler_upstream {
    server scheduler:4000;
    keepalive 32;
}

# Public vhost
server {
    listen 443 ssl http2;
    server_name converter.innovedus.com;

    ssl_certificate     /etc/nginx/certs/fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    location /api/v1/ {
        proxy_pass http://scheduler_upstream;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_request_buffering off;  # stream large uploads
        proxy_read_timeout 300s;
        client_max_body_size 600M;    # slightly above the 500 MB multipart limit
    }

    location = /health {
        proxy_pass http://scheduler_upstream;
    }

    location / {
        return 404 '{"error":{"code":"not_found","message":"Not found"}}';
        default_type application/json;
    }
}

# Internal vhost
server {
    listen 10.0.0.1:80;
    server_name converter-internal.innovedus.com;

    location /jobs {
        proxy_pass http://scheduler_upstream;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_buffering off;  # required for SSE
    }

    location /queues/stats {
        proxy_pass http://scheduler_upstream;
    }

    location / {
        proxy_pass http://web:3000;
    }
}
```


---

## 3. docker-compose.yml environment variable changes

### 3.1 Removed in Phase 0.8b

```yaml
# The external API auth no longer goes through OAuth
- MEMBER_CENTER_ISSUER
- MEMBER_CENTER_JWKS_URL
- KNERON_CONVERTER_AUDIENCE
- JWKS_CACHE_MAX_AGE_MS
- JWKS_COOLDOWN_MS
- JWT_CLOCK_TOLERANCE_SEC
```

### 3.2 Added in Phase 0.8b

```yaml
- CONVERTER_API_KEY=${CONVERTER_API_KEY}  # 64 hex chars from `openssl rand -hex 32`
```

### 3.3 Kept unchanged (needed by promote)

```yaml
- MEMBER_CENTER_TOKEN_URL=${MEMBER_CENTER_TOKEN_URL}
- KNERON_CONVERTER_CLIENT_ID=${KNERON_CONVERTER_CLIENT_ID}
- KNERON_CONVERTER_CLIENT_SECRET=${KNERON_CONVERTER_CLIENT_SECRET}
- FILE_ACCESS_AGENT_BASE_URL=${FILE_ACCESS_AGENT_BASE_URL}
- FILE_ACCESS_AGENT_AUDIENCE=${FILE_ACCESS_AGENT_AUDIENCE}
- PROMOTE_TIMEOUT_MS=${PROMOTE_TIMEOUT_MS:-300000}
- OAUTH_TOKEN_REFRESH_SKEW_MS=${OAUTH_TOKEN_REFRESH_SKEW_MS:-60000}
- OAUTH_TOKEN_TIMEOUT_MS=${OAUTH_TOKEN_TIMEOUT_MS:-10000}
```

### 3.4 Existing (untouched)

```yaml
- PORT=4000
- NODE_ENV=${NODE_ENV:-development}
- REDIS_URL=${REDIS_URL}
- STORAGE_BACKEND=minio
- MINIO_*
- CONVERTER_TENANT_ID=${CONVERTER_TENANT_ID:-}  # still kept in Phase 0.8b (the promote flow may still use it)
- API_V1_RATE_LIMIT_WINDOW_MS=${API_V1_RATE_LIMIT_WINDOW_MS:-300000}
- API_V1_RATE_LIMIT_MAX=${API_V1_RATE_LIMIT_MAX:-300}
- MULTIPART_MODEL_MAX_BYTES=${MULTIPART_MODEL_MAX_BYTES:-524288000}
- MULTIPART_REF_IMAGE_MAX_BYTES=${MULTIPART_REF_IMAGE_MAX_BYTES:-10485760}
- MULTIPART_REF_IMAGES_MAX_COUNT=${MULTIPART_REF_IMAGES_MAX_COUNT:-100}
- MAX_CONCURRENT_UPLOADS=${MAX_CONCURRENT_UPLOADS:-5}
- UPLOAD_RETRY_AFTER_SECONDS=${UPLOAD_RETRY_AFTER_SECONDS:-30}
```

### 3.5 Reasons for removal

| Env | Why it was removed | Purpose in Phase 1 |
|-----|--------------------|--------------------|
| `MEMBER_CENTER_ISSUER` | An API key has no issuer to verify | OAuth resource server verified the `iss` claim |
| `MEMBER_CENTER_JWKS_URL` | An API key needs no JWKS | OAuth JWT signature verification |
| `KNERON_CONVERTER_AUDIENCE` | An API key has no `aud` to verify | OAuth check that the token was issued for this service |
| `JWKS_*` | The JWKS cache no longer exists | Internal JWKS cache parameters |
| `JWT_CLOCK_TOLERANCE_SEC` | JWT verification no longer exists | Clock tolerance for the JWT `exp` check |

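Since the removed variables are easy to leave behind in a deployed `.env`, a small check can list any that are still assigned. This is a minimal sketch, not part of the converter repo: `find_stale_vars` and `REMOVED_VARS` are hypothetical names, and the list simply copies section 3.1.

```bash
# Variables removed in Phase 0.8b (copied from section 3.1).
REMOVED_VARS="MEMBER_CENTER_ISSUER MEMBER_CENTER_JWKS_URL KNERON_CONVERTER_AUDIENCE JWKS_CACHE_MAX_AGE_MS JWKS_COOLDOWN_MS JWT_CLOCK_TOLERANCE_SEC"

# Print every removed variable that is still assigned in the given env file.
find_stale_vars() {
  local file="$1" name
  for name in $REMOVED_VARS; do
    if grep -Eq "^[[:space:]]*${name}=" "$file"; then
      echo "$name"
    fi
  done
}
```

Running `find_stale_vars .env` on a cleaned-up file should print nothing; variables kept for promote, such as `MEMBER_CENTER_TOKEN_URL`, are not reported.
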

---

## 4. `.env.example` changes

### 4.1 Removed sections (OAuth resource server)

```bash
# === OAuth (Member Center) ===  ← remove the whole section
MEMBER_CENTER_ISSUER=...
MEMBER_CENTER_JWKS_URL=...

# === Converter identity (Resource Server) ===  ← remove the whole section
KNERON_CONVERTER_AUDIENCE=...

# === JWKS cache ===  ← remove the whole section
JWKS_CACHE_MAX_AGE_MS=600000
JWKS_COOLDOWN_MS=30000
JWT_CLOCK_TOLERANCE_SEC=60
```

### 4.2 Added section

```bash
# === Phase 0.8b: API key for visionA → converter ===
# Generate 64 hex chars with `openssl rand -hex 32`
# Both sides must match: same value as VISIONA_CONVERTER_API_KEY in visionA's `.env.stage`
# Never put this in git / logs / Slack
CONVERTER_API_KEY=
```

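The key format described above (64 hex chars from `openssl rand -hex 32`) is simple enough to check before deploying. The following is a sketch under assumptions: `generate_api_key` and `is_valid_api_key` are hypothetical helper names, and the `openssl` CLI is assumed to be installed wherever the key is generated.

```bash
# Generate a key as 64 hex characters (32 random bytes); needs the openssl CLI.
generate_api_key() {
  openssl rand -hex 32
}

# Succeed only if the argument is exactly 64 hexadecimal characters.
is_valid_api_key() {
  [ "${#1}" -eq 64 ] || return 1
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  return 0
}
```

A deploy script could call `is_valid_api_key "$CONVERTER_API_KEY" || exit 1` so that an empty or truncated value fails early instead of surfacing later as 401 responses.
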
### 4.3 Kept sections (unchanged, used by promote)

```bash
# === Member Center token endpoint (used by converter → FAA promote) ===
MEMBER_CENTER_TOKEN_URL=https://auth.innovedus.com/oauth/token

# === Converter identity (OAuth Client, used by promote) ===
KNERON_CONVERTER_CLIENT_ID=kneron_converter
KNERON_CONVERTER_CLIENT_SECRET=change-me

# === File Access Agent ===
FILE_ACCESS_AGENT_BASE_URL=https://files.nas.internal
FILE_ACCESS_AGENT_AUDIENCE=file_access_api

# === Promote / OAuth Client tunables ===
PROMOTE_TIMEOUT_MS=300000
OAUTH_TOKEN_REFRESH_SKEW_MS=60000
OAUTH_TOKEN_TIMEOUT_MS=10000

# === Rate Limit ===
API_V1_RATE_LIMIT_WINDOW_MS=300000
API_V1_RATE_LIMIT_MAX=300

# === Multipart upload ===
MULTIPART_MODEL_MAX_BYTES=524288000
MULTIPART_REF_IMAGE_MAX_BYTES=10485760
MULTIPART_REF_IMAGES_MAX_COUNT=100
MAX_CONCURRENT_UPLOADS=5
UPLOAD_RETRY_AFTER_SECONDS=30
```


---

## 5. Deployment order (Phase 0.8b)

**Important**: deploying in the wrong order takes the whole stage environment down. The correct order:

```
Step 1: finish the converter-side implementation + deploy
  - Remove the OAuth middleware, add the API key middleware
  - Add the /result endpoint
  - Set the CONVERTER_API_KEY env
  - From this point the converter accepts only the API key externally (OAuth is removed)
  - The existing visionA stage still uses OAuth → it will hit 401
  ⚠️ This step should be completed before OAuth ever works on visionA stage
     (the visionA OAuth integration never passed, so it already returns 401)

Step 2: verify the new converter endpoint works
  - curl GET /api/v1/jobs/<a completed job>/result with Bearer <CONVERTER_API_KEY>
  - Confirm 200 + NEF binary stream
  - curl POST /api/v1/jobs with the same key
  - Confirm 201 + job_id

Step 3: deploy the visionA backend (already ready, commit 9e29ebf)
  - VISIONA_CONVERTER_API_KEY env matches CONVERTER_API_KEY
  - visionA calls the converter with the API key and uses the new GetResult endpoint

Step 4: e2e verification
  - User upload → init → poll → promote → download
  - All green = done
```

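The GET check in Step 2 can be scripted roughly as follows. This is a sketch under stated assumptions: the `Authorization: Bearer` header form is inferred from "Bearer <CONVERTER_API_KEY>" in Step 2, the helper names (`auth_header`, `result_status`, `expect_status`) are hypothetical, and the base URL and job id are supplied by the caller. The POST check is left out because the request fields are not specified in this document.

```bash
# Build the Authorization header from CONVERTER_API_KEY (assumed header form).
auth_header() {
  printf 'Authorization: Bearer %s' "$CONVERTER_API_KEY"
}

# Print the HTTP status of GET /api/v1/jobs/<id>/result, discarding the body.
result_status() {
  local base_url="$1" job_id="$2"
  curl -sS -o /dev/null -w '%{http_code}' \
    -H "$(auth_header)" \
    "${base_url}/api/v1/jobs/${job_id}/result"
}

# Compare an observed status with the expected one and report the outcome.
expect_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "OK   $label ($actual)"
  else
    echo "FAIL $label (expected $expected, got $actual)"
    return 1
  fi
}

# Example (not run here; JOB_ID must point at a completed job):
#   expect_status "GET result" 200 \
#     "$(result_status https://converter.innovedus.com "$JOB_ID")"
```

Note that passing the key through `-H` makes it visible in the local process list while `curl` runs, so this is best kept to a trusted operator machine.
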
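### 5.1 Note: stage status as of 5/9

Phase 1 OAuth never worked end to end on stage (the MC scope was never registered). For the actual e2e flow, the Phase 0.8b switch is therefore a **net positive** (never worked → starts working). Stage cannot see a regression of the form "OAuth used to pass, and switching to the API key turned it into 401".
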

---

## 6. Security configuration

### 6.1 CONVERTER_API_KEY

See `auth.md` §4 for details.

Key points:
- One key per environment (dev / stage / prod)
- 64 hex chars (`openssl rand -hex 32`)
- Same value on both sides (visionA + converter)
- Never committed to git
- Rotation: manually sync `.env` on both sides + redeploy

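To confirm that both sides hold the same key without pasting the key itself anywhere, operators can compare a short digest instead. This is a minimal sketch: `key_fingerprint` is a hypothetical helper, the 12-character SHA-256 prefix is an arbitrary choice for illustration and is not the converter's `tokenFingerprint` format, and GNU `sha256sum` is assumed to be available.

```bash
# Print the first 12 hex chars of the key's SHA-256 digest.
# Assumes GNU coreutils `sha256sum`; the prefix length is arbitrary.
key_fingerprint() {
  printf '%s' "$1" | sha256sum | cut -c1-12
}

# Example: run on each host and compare the two outputs by eye.
#   key_fingerprint "$CONVERTER_API_KEY"           # on the converter
#   key_fingerprint "$VISIONA_CONVERTER_API_KEY"   # on visionA
```

Matching fingerprints after a rotation indicate the manual `.env` sync reached both services; a mismatch explains a sudden run of 401 responses.
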
### 6.2 Sec C1 deferred (existing risk, unchanged)

`.env` was at one point committed to git history (found in the 5/2 health check). It has since been added to `.gitignore`, but the history is still traceable.

**During Phase 0.8b**:
- When adding `CONVERTER_API_KEY`, make sure it **does not go into git**
- Once Phase 1 is ready, do one git history rewrite + rotate every secret (including the newly added CONVERTER_API_KEY, the existing OAuth client_secret, MinIO, etc.)

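A lightweight guard against repeating that mistake is to verify that any env file about to be committed leaves `CONVERTER_API_KEY` empty, as `.env.example` does. The helper below is a sketch; `has_filled_api_key` is a hypothetical name and the pattern only covers plain `KEY=value` lines.

```bash
# Succeed if the file assigns a non-empty value to CONVERTER_API_KEY.
has_filled_api_key() {
  grep -Eq '^[[:space:]]*CONVERTER_API_KEY=[^[:space:]#]' "$1"
}

# Example: fail a pre-commit hook when .env.example carries a real value.
#   if has_filled_api_key .env.example; then
#     echo "CONVERTER_API_KEY must stay empty in .env.example" >&2
#     exit 1
#   fi
```
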

---

## 7. CI/CD impact

**No CI changes needed**:

- The existing GitHub Actions configuration is unchanged
- Add `CONVERTER_API_KEY` to the stage / prod secrets manager (Vault / k8s secret / docker secret)
- dev uses `.env` (gitignored)

---

## 8. Reserved for Phase 2

- Multi-instance deployment: the rate limiter must move from process-local memory to a Redis store
- Multiple callers: consider bringing back the OAuth resource server (API key + OAuth side by side)
- Automatic rotation through a secrets manager: integrate HashiCorp Vault / AWS Secrets Manager