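jim800121chen 4d0b870480 feat(visionA-backend): DB integration: wire 6 stores to PostgreSQL/Redis persistence (blocks 0-5)
Wire the 6 in-memory stores of visionA-backend to database persistence; scope = full
(all PG stores wired + sessions on Redis + transactional resilience). Interfaces and handlers
are unchanged; only DB implementations are added and the wiring is swapped. When the config
sets no DB, the in-memory fallback is kept.

- Block 0, infrastructure: pgx/v5 connection pool + DatabaseConfig/RedisConfig + golang-migrate
  runner (embed) + cmd/migrate + testcontainers test infrastructure
- Block 1, model → Postgres: array mapping, upsert that preserves CreatedAt, faa_object_key,
  three-dimensional filter (owner/chip/source), soft-delete partial index
- Block 2, device → Postgres: partial unique (a deleted serial can be re-registered), dual status columns
- Block 3, token → Postgres: pairing_tokens + session_tokens in separate tables, token_hash as PK
- Block 4, userSession → Redis: idle + absolute dual TTL replaces the cleanup goroutine
  (tunnel sessions stay in-memory because yamux handles are not serializable)
- Block 5, transactions/resilience: WithTx helper + deleting a device cascades token revocation (atomic, same tx)
  + /healthz pings PG/Redis (fail-fast 503) + unified pgx error mapping (raw errors are not leaked)

Degradation strategy (fail-fast): PG down → persistent-data APIs return 503; Redis down → sessions fail,
with no automatic in-memory fallback (avoids sessions diverging across instances).

DB: PostgreSQL 14.23 (gen_random_uuid built in; no citext → email uses a lower() unique
index). Each block passed Reviewer review and a full green dbtest run on real PG/Redis testcontainers;
the in-memory fallback is unaffected.

docs: also update database.md (schema/config/migration list) + api-spec.md
(409/503 error codes, new /healthz behavior, device unpair cascade).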

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 18:28:04 +08:00


# API Spec: REST + WebSocket endpoints for the frontend
> **Base URL**: `https://api.visiona.cloud` (Phase 1) / `http://localhost:3001` (prototype)
> **Authentication**: `Authorization: Bearer <JWT>` (optional in the prototype, which uses `StaticAuthService`)
> **Common response format**:
> ```json
> { "success": true, "data": {...} }
> { "success": false, "error": { "code": "ERR_CODE", "message": "..." } }
> ```
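
A client can unwrap this envelope in one place. The following is a minimal sketch, not part of the spec: `ApiError` and `unwrap` are hypothetical names, and falling back to `INTERNAL_ERROR` when the `error` object is missing is an assumption.

```python
import json


class ApiError(Exception):
    """Raised when the envelope carries success == false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(body):
    """Return `data` from a success envelope, or raise ApiError."""
    envelope = json.loads(body)
    if envelope.get("success"):
        return envelope.get("data")
    error = envelope.get("error") or {}
    # Assumption: a malformed error envelope is reported as INTERNAL_ERROR.
    raise ApiError(error.get("code", "INTERNAL_ERROR"), error.get("message", ""))
```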
---
## 1. Auth (prototype stub)
### POST `/api/auth/login`
- Prototype: returns `501 { code: "NOT_IMPLEMENTED" }`
- Phase 1: request `{ email, password }`, response `{ user, access_token, refresh_token }`
### POST `/api/auth/register`
- Same as above
### POST `/api/auth/logout`
- Phase 1: clears the refresh token
### GET `/api/auth/me`
- Prototype: returns the hard-coded `demo-user`
- Phase 1: taken from the JWT
---
## 2. Pairing
### POST `/api/pairing/token`
- Auth required (silently passes in the prototype)
- Prototype response:
```json
{ "success": false, "error": { "code": "NOT_IMPLEMENTED", "message": "Dev uses env VISIONA_PAIRING_TOKEN" } }
```
- Phase 1 response:
```json
{
  "success": true,
  "data": {
    "token": "pk_AbCd1234...",
    "expires_at": "2026-04-21T13:00:00Z"
  }
}
```
### GET `/api/pairing/status`
- Queries the tunnel connection status of the current user
- Response:
```json
{
  "success": true,
  "data": {
    "connected": true,
    "connected_at": "2026-04-21T12:00:00Z",
    "last_seen_at": "2026-04-21T12:34:56Z",
    "device_id": "dev-xxx",
    "agent_version": "local-tool 1.2.3"
  }
}
```
### GET `/api/pairing/tokens`
- Lists all tokens of the current user
- Phase 1: returns an array of `{ id, device_id, kind, created_at, last_seen_at }`
### DELETE `/api/pairing/tokens/:id`
- Revokes the specified token
- Implemented in Phase 1; the prototype returns 501
---
## 3. Devices
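Most of the endpoints below **are forwarded to the local agent**. api-server behavior:
1. Check that the user has a tunnel connection
2. If a device_id is passed, check ownership
3. Forward the request to the local agent through the tunnel (reusing the POC `handleProxy`)
4. Return the local agent's response

Paths and response formats are **the same as in local-tool**, so the frontend only needs to change the base URL.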
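
The four forwarding steps can be sketched as a single guard function. This is an illustrative sketch under assumptions, not the `handleProxy` code: `TunnelRegistry`, `forward`, and the `send` callable are hypothetical, and answering an ownership failure with `403 FORBIDDEN` is an assumption (the error codes themselves come from §11).

```python
from dataclasses import dataclass, field


@dataclass
class TunnelRegistry:
    """Hypothetical in-memory view of live tunnels and device ownership."""
    connected_users: set = field(default_factory=set)
    device_owner: dict = field(default_factory=dict)  # device_id -> user_id


def _error(status, code, message):
    return status, {"success": False, "error": {"code": code, "message": message}}


def forward(registry, user_id, device_id, send):
    """Apply steps 1-4; `send` stands in for the tunnel round trip."""
    # Step 1: the user must have a live tunnel.
    if user_id not in registry.connected_users:
        return _error(502, "TUNNEL_DISCONNECTED", "Local agent not connected")
    # Step 2: ownership is checked only when a device_id is given.
    if device_id is not None and registry.device_owner.get(device_id) != user_id:
        return _error(403, "FORBIDDEN", "Device does not belong to this user")
    # Steps 3 and 4: forward, then relay the agent's response unchanged.
    try:
        return send(user_id)
    except ConnectionError:
        return _error(502, "TUNNEL_ERROR", "Tunnel transport error")
```

Keeping the checks ahead of the tunnel call means a request for someone else's device never reaches the local agent.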
### GET `/api/devices`: list the devices found by the current local scan
### POST `/api/devices/scan`: trigger a rescan
### GET `/api/devices/:id`: a single device
### POST `/api/devices/:id/connect`
### POST `/api/devices/:id/disconnect`
### POST `/api/devices/:id/flash`: flash firmware (through the tunnel)
### POST `/api/devices/:id/inference/start`
### POST `/api/devices/:id/inference/stop`
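**Cloud-specific (not forwarded through the tunnel)**:
### GET `/api/cloud/devices`: list the device records I have bound in the cloud
- Unlike `GET /api/devices`, this queries the cloud DB and does not ask the local agent
- Prototype: served from `InMemoryDeviceRepository`
- Response: `[{ id, name, device_type, serial_number, status, last_seen_at }]`
### POST `/api/cloud/devices/:id/rename`
- Renames the device in the cloud
### DELETE `/api/cloud/devices/:id`: unbind (unpair) and delete the cloud device record
- **Behavior after DB integration (block 5)**: deleting a device cascades, **within the same transaction**, to revoke all of that device's pairing tokens and session tokens (both the `pairing_tokens` and `session_tokens` tables, by `device_id`, via `UPDATE ... SET revoked_at = now()`). The corresponding DB-level consistency definition is in `../database.md` §6.
- After revocation, the device's existing tunnel sessions can no longer be resumed; the device must be paired again.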
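
The same-transaction cascade can be illustrated with a small runnable sketch. It uses the standard-library `sqlite3` as a stand-in for PostgreSQL (so `CURRENT_TIMESTAMP` replaces `now()`), and the minimal schema, the `deleted_at` soft-delete column, and the `unpair_device` name are all assumptions; the real schema is defined in `../database.md`.

```python
import sqlite3

# Assumed minimal schema, for illustration only.
DEMO_SCHEMA = """
CREATE TABLE devices (id TEXT PRIMARY KEY, deleted_at TEXT);
CREATE TABLE pairing_tokens (token_hash TEXT PRIMARY KEY, device_id TEXT, revoked_at TEXT);
CREATE TABLE session_tokens (token_hash TEXT PRIMARY KEY, device_id TEXT, revoked_at TEXT);
"""


def unpair_device(conn, device_id):
    """Revoke a device's tokens and delete the device in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        for table in ("pairing_tokens", "session_tokens"):
            conn.execute(
                f"UPDATE {table} SET revoked_at = CURRENT_TIMESTAMP "
                "WHERE device_id = ? AND revoked_at IS NULL",
                (device_id,),
            )
        cursor = conn.execute(
            "UPDATE devices SET deleted_at = CURRENT_TIMESTAMP "
            "WHERE id = ? AND deleted_at IS NULL",
            (device_id,),
        )
        if cursor.rowcount == 0:
            # Raising here also rolls back the token revocations above.
            raise LookupError("device not found")
```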
---
## 4. Models
### GET `/api/models`: list the user's models
- Cloud models (kept in storage) + preset models (hard-coded)
- Response:
```json
{
  "success": true,
  "data": [
    {
      "id": "abc-123",
      "name": "YOLOv5 Face",
      "target_chip": "kl520",
      "file_size": 12345678,
      "source": "uploaded",
      "created_at": "..."
    }
  ]
}
```
### GET `/api/models/:id`
- Model details
### POST `/api/models/init`: initialize an upload
- Request: `{ name, file_size, checksum, target_chip, description? }`
- Response:
```json
{
  "success": true,
  "data": {
    "model_id": "new-id",
    "upload_url": "https://...presigned-put-url...",
    "upload_expires_at": "..."
  }
}
```
### POST `/api/models/:id/finalize`
- Called after the presigned PUT succeeds
- api-server verifies that the file exists and that size / checksum match → status changes to "ready"
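
The finalize check can be sketched as below for the LocalFS case. The spec does not name the checksum algorithm, so SHA-256 as a hex string is an assumption, and `verify_upload` and its failure labels are hypothetical.

```python
import hashlib
import os


def verify_upload(path, expected_size, expected_checksum):
    """Return "ready" if the uploaded file matches, else a failure reason."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) != expected_size:
        return "size_mismatch"
    digest = hashlib.sha256()  # assumption: checksum is hex-encoded SHA-256
    with open(path, "rb") as fh:
        # Hash in 1 MiB chunks so large model files are not loaded into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_checksum.lower():
        return "checksum_mismatch"
    return "ready"
```

Checking the size first avoids hashing a file that is already known to be wrong.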
### DELETE `/api/models/:id`
### POST `/api/models/:id/load-to-device`
- Body: `{ device_id }`
- api-server generates a presigned GET URL → sends it through the tunnel to the local agent to "download and load"
- Returns the job status
---
## 5. Clusters (ported from the POC)
### GET `/api/clusters`
### POST `/api/clusters`
- Body: `{ name, device_ids: [...] }`
### GET `/api/clusters/:id`
### DELETE `/api/clusters/:id`
### POST `/api/clusters/:id/devices`
### DELETE `/api/clusters/:id/devices/:deviceId`
### PUT `/api/clusters/:id/devices/:deviceId/weight`
### POST `/api/clusters/:id/flash`
### POST `/api/clusters/:id/inference/start`
### POST `/api/clusters/:id/inference/stop`
---
## 6. Camera / Media
Same as local-tool; everything is forwarded through the tunnel:
### GET `/api/camera/list`
### POST `/api/camera/start`
### POST `/api/camera/stop`
### GET `/api/camera/stream`: MJPEG (streamed through the tunnel)
### POST `/api/media/upload/image`
### POST `/api/media/upload/video`
### POST `/api/media/upload/batch-images`
### GET `/api/media/batch-images/:index`
### POST `/api/media/seek`
---
## 7. System
### GET `/api/system/health`
- Cloud side: returns api-server's own health + tunnel connection status
```json
{
  "success": true,
  "data": {
    "api_server": "ok",
    "tunnel_connected": true,
    "agent_last_seen_at": "..."
  }
}
```
### GET `/api/system/info`
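- Version information
### GET `/healthz`: liveness / readiness (for the load balancer)
- Pure infrastructure health-check endpoint (no `/api/*` prefix), for LB / orchestrator probes.
- **Behavior after DB integration (block 5)**: PostgreSQL / Redis **are pinged when enabled**; if any ping fails → returns **503** (so the load balancer knows this instance is unhealthy and stops routing traffic to it). **Dependencies that are not enabled are skipped** (when the prototype has no DB/Redis configured, these dependencies count as not-applicable and do not affect the health verdict).
- Response (healthy): `200 { "status": "ok" }`
- Response (unhealthy): `503 { "status": "unavailable", "failed": ["postgres"] }` (`failed` lists the dependencies whose ping failed)
- Difference from `GET /api/system/health`: `/healthz` is an infrastructure probe (includes DB/Redis pings); `/api/system/health` is business-level health (api-server itself + tunnel connection status).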
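
The `/healthz` decision rule (ping what is enabled, skip what is not, 503 on any failure) can be sketched as follows. The `healthz` function and the shape of `checks` are hypothetical; only the status codes and response bodies come from the spec above.

```python
def healthz(checks):
    """Evaluate dependency pings.

    `checks` maps a dependency name to a zero-argument ping callable,
    or to None when that dependency is not enabled in the config.
    """
    failed = []
    for name, ping in checks.items():
        if ping is None:
            continue  # not enabled: treated as not-applicable
        try:
            ping()
        except Exception:
            failed.append(name)
    if failed:
        return 503, {"status": "unavailable", "failed": failed}
    return 200, {"status": "ok"}
```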
---
## 8. Converter
### 8.1 Phase 1 stub (existing, kept)
> Prototype stub routes. The real integration in Phase 0.8 goes through §8.2 `/api/conversion/*` instead; the routes below are kept as placeholders, to be superseded in Phase 1 if needed.
#### POST `/api/converter/jobs`
- Body: `{ source_model_key, target_chip, params? }`
- Response: `{ job_id, status: "queued" }`
#### GET `/api/converter/jobs`
- Lists the user's jobs
#### GET `/api/converter/jobs/:id`
- Job status
#### GET `/api/converter/jobs/:id/download`
- Downloads the artifact (presigned URL redirect)

**Detailed contract** → [`api-converter-contract.md`](api-converter-contract.md)
### 8.2 Phase 0.8: `/api/conversion/*` (conversion feature integration)
Production integration with the kneron_model_converter scheduler + FAA delegated download:
- `POST /api/conversion/init`: multipart streaming proxy to the converter (creates a job)
- `GET /api/conversion/{job_id}`: query status (HTTP polling; the frontend polls every 2s)
- `POST /api/conversion/{job_id}/promote-to-models`: "add to the model library"
- `POST /api/conversion/{job_id}/download-token`: exchange for a delegated URL that the browser uses to connect to FAA directly
- `GET /api/conversion/active`: check whether the current user has an active job

**Detailed contract** → [`api-conversion.md`](api-conversion.md)
**Internal design** → [`../conversion.md`](../conversion.md)
**ADR** → [`../adr/adr-014-conversion-integration.md`](../adr/adr-014-conversion-integration.md)
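
The 2-second status polling on `GET /api/conversion/{job_id}` might look like the loop below. This is a sketch under assumptions: the terminal status names and the timeout are placeholders (the real values belong to `api-conversion.md`), and `fetch_status` is a hypothetical callable that performs the HTTP request and returns the job status.

```python
import time

# Assumption: placeholder terminal statuses; see api-conversion.md for the real set.
TERMINAL = {"completed", "failed"}


def poll_job(fetch_status, interval=2.0, timeout=600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError("conversion job did not finish in time")
        sleep(interval)  # 2s matches the frontend polling interval
```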
---
## 9. WebSocket
### WS `/ws/devices/events`
- Subscribes to device online/offline events
- Server push:
```json
{ "type": "device.connected", "device_id": "xxx", "at": "..." }
{ "type": "device.disconnected", "device_id": "xxx", "at": "..." }
```
### WS `/ws/devices/:id/flash-progress`
- Flash progress (fetched from the local agent through the tunnel)
### WS `/ws/devices/:id/inference`
- Inference result stream
### WS `/ws/server-logs`
- Log broadcast (reuses local-tool's broadcaster)
### WS `/ws/system`
- System events (server:shutdown-imminent, etc.)
### WS `/ws/clusters/:id/inference`
### WS `/ws/clusters/:id/flash-progress`
### WS `/ws/pairing/status` (new)
- Subscribes to tunnel connection status changes
- Server push:
```json
{ "type": "tunnel.connected", "connected_at": "..." }
{ "type": "tunnel.disconnected", "reason": "network_error", "at": "..." }
```
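
A subscriber can fold these pushes into a small connection state. The reducer below is a minimal sketch: `apply_event` and the state shape are hypothetical, and ignoring unknown event types is an assumed policy rather than documented behavior.

```python
import json


def apply_event(state, raw):
    """Return the new tunnel state after one server-push message."""
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "tunnel.connected":
        return {"connected": True,
                "connected_at": event.get("connected_at"),
                "reason": None}
    if kind == "tunnel.disconnected":
        return {"connected": False,
                "connected_at": None,
                "reason": event.get("reason")}
    return state  # assumption: unknown event types leave the state unchanged
```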
---
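## 10. Storage (prototype LocalFS proxy)
### GET `/storage/*filepath?expires=...&signature=...`
- Fake presigned GET for LocalFS
- Verifies the signature, then reads the file and returns it
### PUT `/storage/*filepath?expires=...&signature=...`
- Fake presigned PUT for LocalFS
- Verifies the signature, then receives the body and writes the file

**Phase 1**: served directly by S3, bypassing api-server.
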
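
One way to implement the `expires` + `signature` check is an HMAC over the method, path, and expiry. The spec does not define the signing scheme, so everything here is an assumption: HMAC-SHA256, the newline-joined message layout, `expires` as Unix seconds, and the `sign` / `verify` names.

```python
import hashlib
import hmac
import time


def sign(secret, method, path, expires):
    """Return a hex signature binding the method, path and expiry together."""
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, method, path, expires, signature, now=None):
    """Accept only unexpired requests whose signature matches."""
    now = time.time() if now is None else now
    if now > expires:
        return False
    expected = sign(secret, method, path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```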
---
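## 11. Error codes
| Code | HTTP | Description |
|------|------|------|
| `UNAUTHORIZED` | 401 | Not authenticated, or the token is invalid |
| `FORBIDDEN` | 403 | Insufficient permissions |
| `NOT_FOUND` | 404 | Resource does not exist |
| `VALIDATION_FAILED` | 400 | Input validation failed |
| `CONFLICT` | 409 | Uniqueness conflict (e.g. re-registering an active device serial, email already exists). DB unique violations map to this code. |
| `TUNNEL_DISCONNECTED` | 502 | Local agent not connected |
| `TUNNEL_ERROR` | 502 | Tunnel transport error |
| `NOT_IMPLEMENTED` | 501 | Not yet implemented in the prototype |
| `RATE_LIMITED` | 429 | Too many requests (Phase 1) |
| `INTERNAL_ERROR` | 500 | Unexpected error |
| `SERVICE_UNAVAILABLE` | 503 | Fail-fast when the connection to a backend dependency (PostgreSQL / Redis) fails. Persistent-data APIs (model / device / token) return this code when PG is unavailable, instead of fake data. |

> **Degradation strategy after DB integration (fail-fast; decided by the user on 2026-06-20)**:
> - **PostgreSQL down** → persistent-data APIs (model / device / token) return `503 SERVICE_UNAVAILABLE`; **no fake data and no in-memory fallback** (avoids returning stale or inconsistent data).
> - **Redis down** → session validation fails (the request is treated as unauthenticated → `401 UNAUTHORIZED`); **no automatic fallback to in-memory sessions** (avoids sessions diverging across instances in a multi-instance deployment).
> - DB unique violation → `409 CONFLICT` (not 500), so the frontend can tell a conflict from an unexpected error.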
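
The DB error mapping can be sketched in terms of PostgreSQL SQLSTATE codes: `23505` is `unique_violation` and class `08` covers connection exceptions. The `map_db_error` helper and its fixed messages are hypothetical; the backend's actual pgx mapping may cover more cases.

```python
def map_db_error(sqlstate):
    """Map a SQLSTATE (or None when no connection exists) to an API error.

    The returned message is fixed text so the raw driver error never
    reaches the client.
    """
    if sqlstate is None or sqlstate.startswith("08"):
        status, code, message = 503, "SERVICE_UNAVAILABLE", "Database unavailable"
    elif sqlstate == "23505":
        status, code, message = 409, "CONFLICT", "Resource already exists"
    else:
        status, code, message = 500, "INTERNAL_ERROR", "Unexpected error"
    return status, {"success": False, "error": {"code": code, "message": message}}
```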
---
## 12. Pagination
For lists that will grow (models, devices, jobs), use cursor-based pagination:
```
GET /api/models?limit=50&cursor=...
Response:
{ "data": [...], "next_cursor": "..." | null }
```
The prototype can simply return everything (in-memory); implement cursors when the DB is wired in during Phase 1.

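
Cursor-based pagination can be sketched over an in-memory list as below. The cursor encoding (URL-safe base64 of the last `id`) and the assumption that rows are sorted by a unique ascending `id` are illustrative choices; the spec only fixes the `limit` / `cursor` parameters and the `next_cursor` field.

```python
import base64
import json


def encode_cursor(last_id):
    """Pack the last returned id into an opaque, URL-safe token."""
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()


def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["id"]


def paginate(rows, limit=50, cursor=None):
    """Return one page; `rows` must be sorted by a unique ascending `id`."""
    after = decode_cursor(cursor) if cursor else None
    window = [row for row in rows if after is None or row["id"] > after]
    page = window[:limit]
    has_more = len(window) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```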
---
**Prototype MVP list** (must have):
- `GET /api/system/health`
- `GET /api/pairing/status`
- `GET /api/devices` + forwarding through the tunnel
- `GET /api/models` + `POST /api/models/init` + `/finalize` (LocalFS)
- `/storage/*` proxy
- WS `/ws/devices/events`
- WS `/ws/pairing/status`

Everything else can return 501 or be stubbed for now.