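L-tier new feature: PRD/Design/TDD/ADR three-way collaboration, cross-review, and M9-6 SDK dual verification; about 9,000 lines of documentation in total.
Scope:
- Stage A (MVP, 5 person-days): automatic KDP1 → KDP2 upgrade for KL520 + KL720
- Stage B (10.5 person-days): manual downgrade exposed to general users + KL630 / KL730 expansion
- Total 15.5 person-days; installer grows by 7 MB (conservative bundle strategy)
Key decisions:
- Reverse R5-Q9 (the second-round user decision in progress.md: "firmware flashing (flash) → B, cut")
- Go cross-platform with the KneronPLUS Python C API instead of DFUT.exe
- Multi-version directory layout: option C, metadata (firmware/<chip>/{version}/ + CURRENT_VERSION)
- Kneron firmware redistribution licensing is the same kind of issue as the R5-B4 bundled models; to be evaluated before release
Documents produced:
- PRD v2.2 (PRD-v2.md, 495 lines + features/feature-firmware-management.md, 599 lines)
- Design v2.2 (firmware-management.md, 948 lines + control-panel.md §6a graceful shutdown)
- TDD v2.2 (v2/firmware-management.md, 823 lines + ADR-001, 218 lines)
- 8 research documents (including M9-6 weak verification + strong verification, ~3,200 lines)
- 3 three-way cross-review reports (PM/Design/Architect cross-review)
Major findings from M9-6 strong verification (affect Stage B):
- The KL730 product_id is actually 0x732 (not 0x0730)
- KL630/KL730 firmware is an embedded Linux rootfs (not a .bin; a different-generation design)
- KneronPLUS Python has no public update_kdp_firmware_from_files API; warrenchen goes through ctypes
- Stage A is unaffected; Stage B M9-8 needs a spike
Next step: dispatch backend to start M9-1 (bridge.py handle_firmware_upgrade)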
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Integration Plan: FW Detection + Upgrade MVP (Recommended Option A)

> Corresponds to research plan §30
> Scope: Stage A (automatic KDP1 → KDP2 upgrade, KL520 + KL720)
> Out of scope: Stage B (manual downgrade, KL630/KL730); to be decided after Stage A is validated

---

## 1. Milestone Breakdown

### Overall dependency graph

```
M9-0 (decision)  User reviews this plan → agrees on the MVP scope
    ↓
M9-1 (bridge)    bridge.py: add handle_firmware_upgrade + add fw_loader.bin
    ↓
M9-2 (driver)    Go driver: UpgradeFirmware() method
    ↓
M9-3 (api)       API handler + WebSocket room + DeviceInfo field extension
    ↓
M9-4 (ui)        Frontend: FW badge + upgrade button + progress modal
    ↓
M9-5 (verify)    Real-device verification on three platforms (macOS / Windows / Linux)
```

### M9-0: Decision + documentation first (0.5 person-days)

**Owner**: Orchestrator consolidates input from PM, Architect, and Design, and finalizes the PRD / TDD / Design Spec patches.

**Tasks**:
1. User confirms the MVP scope (KL520 + KL720, automatic upgrade only, no downgrade)
2. PM adds a new section "§N. Firmware Management (v2.2 patch)" to `docs/autoflow/02-prd/PRD-v2.md`, explicitly flagging the R5-Q9 reversal and the scope
3. Architect adds a new sub-document `v2/device-firmware.md` to `docs/autoflow/04-architecture/TDD-v2.md` and a new ADR `adr/ADR-009-firmware-management.md`
4. Design adds the Devices page FW card wireframe and the upgrade modal flow under `docs/autoflow/03-design/`
5. Three-way cross-review → user confirms

**Acceptance**: all shared documents pass three-way cross-review and the user signs off.

---

### M9-1: Add the firmware_upgrade handler to bridge.py (1 person-day)

**Owner**: Backend Agent

**Depends on**: M9-0 complete

**Tasks**:

1. **Copy the firmware file**: copy `local_service_win/firmware/KL520/fw_loader.bin` from the warrenchen repo to `server/scripts/firmware/KL520/fw_loader.bin` and commit it to git
   - Confirm the file size and add it to the `server/scripts/firmware/KL520/VERSION` lookup table (annotate the existing file if there is one)

2. **Add the handler to bridge.py** (pseudo-code; the architect will provide a more detailed stub at implementation time):

```python
import os
import time


def handle_firmware_upgrade(params):
    """Upgrade KL520/KL720 to bundled KDP2 firmware.

    Params:
        port: int (USB port id, required)
        chip: str ("KL520" or "KL720", required)

    Returns:
        {"status":"upgraded", "before_firmware":"KDP", "after_firmware":"KDP2",
         "method":"kp_update_kdp_firmware_from_files", "duration_ms":28500}
        or {"error":"...", "stage":"scan|connect|resolve|loader|upgrade|verify"}
    """
    global _device_group, _firmware_loaded

    chip = params.get("chip", "")
    target_port = params.get("port", "")

    if not HAS_KP:
        return {"error": "kp module not available"}

    # Stage 1: scan + find target
    try:
        descs = kp.core.scan_devices()
        target_dev = None
        for i in range(descs.device_descriptor_number):
            dev = descs.device_descriptor_list[i]
            if str(dev.usb_port_id) == str(target_port):
                target_dev = dev
                break
        if target_dev is None:
            return {"error": f"device on port {target_port} not found", "stage": "scan"}

        detected_fw = str(target_dev.firmware)
        _log(f"FW upgrade: detected={detected_fw}, chip={chip}, port={target_port}")
    except Exception as e:
        return {"error": str(e), "stage": "scan"}

    # Stage 2: connect with magic pass
    try:
        _clear_device_group()
        _device_group = kp.core.connect_devices_with_magic_pass(
            usb_port_ids=[target_dev.usb_port_id],
            magic=536173391  # KDP_MAGIC_CONNECTION_PASS
        )
        kp.core.set_timeout(_device_group, 60000)
    except Exception as e:
        return {"error": str(e), "stage": "connect"}

    # Stage 3: resolve firmware paths
    scpu_path, ncpu_path = _resolve_firmware_paths(chip)
    loader_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)),
        "firmware", chip, "fw_loader.bin"
    )
    if not scpu_path or not ncpu_path:
        return {"error": f"firmware files not found for {chip}", "stage": "resolve"}

    # Stage 4: upgrade
    try:
        start = time.time()

        if "KDP" in detected_fw.upper() and "KDP2" not in detected_fw.upper():
            # Legacy KDP1 → go through the loader first, then load KDP2 to RAM
            if not os.path.exists(loader_path):
                return {"error": f"loader not found: {loader_path}", "stage": "loader"}

            _log(f"Loading loader to switch to USB Boot: {loader_path}")
            kp.core.update_kdp_firmware_from_files(
                _device_group, loader_path, None, auto_reboot=True
            )
            # auto_reboot → the device re-enumerates; disconnect returns
            # non-zero afterwards, which is tolerated
            time.sleep(2)

            # Rescan + reconnect
            _clear_device_group()
            descs = kp.core.scan_devices()
            # ... find target_dev again and reconnect with the magic pass
            # (details to be filled in at implementation time)

            kp.core.set_timeout(_device_group, 60000)
            kp.core.load_firmware_from_file(_device_group, scpu_path, ncpu_path)
        else:
            # Already KDP2 or Loader: load directly
            kp.core.load_firmware_from_file(_device_group, scpu_path, ncpu_path)

        _firmware_loaded = True
        duration_ms = int((time.time() - start) * 1000)
    except Exception as e:
        return {"error": str(e), "stage": "upgrade"}

    # Stage 5: verify
    try:
        # Disconnect, wait for USB to settle, rescan, confirm the firmware changed
        try:
            kp.core.disconnect_devices(_device_group)
        except Exception:
            pass  # a failed disconnect is expected after auto_reboot
        _device_group = None
        time.sleep(3)

        descs = kp.core.scan_devices()
        after_fw = "unknown"
        for i in range(descs.device_descriptor_number):
            dev = descs.device_descriptor_list[i]
            if str(dev.usb_port_id) == str(target_port):
                after_fw = str(dev.firmware)
                break

        return {
            "status": "upgraded",
            "before_firmware": detected_fw,
            "after_firmware": after_fw,
            "method": "kp_update_kdp_firmware_from_files",
            "duration_ms": duration_ms,
        }
    except Exception as e:
        return {"error": str(e), "stage": "verify"}
```

3. **Add dispatch to the main loop** (L1166-1180):

```python
elif action == "firmware_upgrade":
    result = handle_firmware_upgrade(cmd)
```

**Acceptance**:
- Manual test with `python3 server/scripts/kneron_bridge.py` (no device attached) → the ready signal is emitted normally
- Simulated device-not-found flow → returns the correct error
- Real KL520 KDP2 dongle running `firmware_upgrade` → should short-circuit (detected_fw is not KDP1, so it takes the load_firmware path)
- Real KL520 KDP1 (if the user has an old dongle) → the full upgrade flow passes

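The handler pseudo-code repeats the same scan-and-match loop in Stage 1 and Stage 5, and the elided reconnect step will need it a third time. A minimal sketch of factoring that loop out, assuming only the descriptor fields the pseudo-code already uses (`device_descriptor_number`, `device_descriptor_list`, `usb_port_id`). The name `find_device_by_port` and the stand-in objects are hypothetical and not part of KneronPLUS:

```python
from types import SimpleNamespace


def find_device_by_port(descs, target_port):
    """Return the descriptor whose usb_port_id matches target_port, or None.

    Ports are compared as strings, as in the handler, so an int port on the
    descriptor still matches a str port that arrived via JSON.
    """
    for i in range(descs.device_descriptor_number):
        dev = descs.device_descriptor_list[i]
        if str(dev.usb_port_id) == str(target_port):
            return dev
    return None


# Stand-in for the object returned by a device scan (hypothetical data).
fake_descs = SimpleNamespace(
    device_descriptor_number=2,
    device_descriptor_list=[
        SimpleNamespace(usb_port_id=13, firmware="KDP"),
        SimpleNamespace(usb_port_id=21, firmware="KDP2"),
    ],
)

match = find_device_by_port(fake_descs, "21")
print(match.firmware if match else "not found")
```

With a helper like this, Stage 1 and Stage 5 would each reduce to one call plus a `None` check, which also makes the device-not-found acceptance case testable without hardware.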
---

### M9-2: Go driver UpgradeFirmware() method (1 person-day)

**Owner**: Backend Agent

**Depends on**: M9-1 complete

**Tasks**:

1. **Extend the driver interface** in `server/internal/driver/interface.go`:

```go
type DeviceDriver interface {
	// ... existing methods ...
	UpgradeFirmware(progressCh chan<- FirmwareProgress) error // new
}

type FirmwareProgress struct {
	Percent int    `json:"percent"`
	Stage   string `json:"stage"` // "connecting" | "loading_loader" | "loading_firmware" | "verifying" | "done" | "error"
	Message string `json:"message,omitempty"`
	Error   string `json:"error,omitempty"`
}

type DeviceInfo struct {
	// ... existing fields ...
	FirmwareIsLegacy   bool   `json:"firmwareIsLegacy,omitempty"`
	FirmwareCanUpgrade bool   `json:"firmwareCanUpgrade,omitempty"`
	BundledFirmwareVer string `json:"bundledFirmwareVersion,omitempty"`
}
```

2. **Implement `KneronDriver.UpgradeFirmware()`** (lives in `kl720_driver.go`; the file name is historical baggage, but the file is shared for now):

```go
func (d *KneronDriver) UpgradeFirmware(progressCh chan<- driver.FirmwareProgress) error {
	d.mu.Lock()
	d.info.Status = driver.StatusUpgrading // new status
	chip := d.chipType
	port := d.info.Port
	d.mu.Unlock()

	// Disconnect existing connection first
	d.Disconnect()

	// Start a fresh Python bridge
	if err := d.startPython(); err != nil {
		return fmt.Errorf("start bridge: %w", err)
	}
	defer d.stopPython()

	progressCh <- driver.FirmwareProgress{Percent: 5, Stage: "connecting"}

	resp, err := d.sendCommand(map[string]interface{}{
		"cmd":  "firmware_upgrade",
		"port": port,
		"chip": chip,
	})
	if err != nil {
		return fmt.Errorf("upgrade: %w", err)
	}

	// Push progress (the bridge returns one final result, so progress is simulated)
	progressCh <- driver.FirmwareProgress{Percent: 90, Stage: "verifying"}

	beforeFw, _ := resp["before_firmware"].(string)
	afterFw, _ := resp["after_firmware"].(string)
	d.mu.Lock()
	d.info.FirmwareVer = afterFw
	d.info.Status = driver.StatusDetected // a fresh connect is required after the upgrade
	d.needsReset = true                   // the next connect goes through the full reset
	d.mu.Unlock()

	progressCh <- driver.FirmwareProgress{
		Percent: 100,
		Stage:   "done",
		Message: fmt.Sprintf("upgraded %s -> %s", beforeFw, afterFw),
	}
	return nil
}
```

3. **Add a driver status**: `StatusUpgrading DeviceStatus = "upgrading"`

4. **Compute derived fields in `Info()`**: inside `Info()`, set `FirmwareIsLegacy` from the `FirmwareVer` string:

```go
func (d *KneronDriver) Info() driver.DeviceInfo {
	d.mu.Lock()
	defer d.mu.Unlock()
	info := d.info
	fw := strings.ToUpper(info.FirmwareVer)
	info.FirmwareIsLegacy = strings.Contains(fw, "KDP") && !strings.Contains(fw, "KDP2")
	info.FirmwareCanUpgrade = info.FirmwareIsLegacy && bundledFirmwareExists(d.chipType)
	info.BundledFirmwareVer = readBundledFwVersion(d.chipType) // read from firmware/<chip>/VERSION
	return info
}
```

5. **Create `server/internal/firmware/service.go`**, modeled on `flash/service.go`:

```go
package firmware

type Service struct {
	deviceMgr *device.Manager
	tracker   *ProgressTracker
}

func (s *Service) StartUpgrade(deviceID string) (string, <-chan driver.FirmwareProgress, error) {
	session, _ := s.deviceMgr.GetDevice(deviceID)
	// ... same goroutine + progressCh pattern as flash/service.go ...
	go func() {
		err := session.Driver.UpgradeFirmware(task.ProgressCh)
		if err != nil {
			task.ProgressCh <- driver.FirmwareProgress{Percent: -1, Stage: "error", Error: err.Error()}
		}
		close(task.ProgressCh)
	}()
	return taskID, task.ProgressCh, nil
}
```

**Acceptance**:
- `go build ./...` PASS
- `go test ./server/internal/firmware/...` has unit tests (mock driver)
- Integration test with M9-1 (manually attach a device and run the full flow)

---

### M9-3: API handler + WebSocket + DeviceInfo extension (0.5 person-days)

**Owner**: Backend Agent

**Depends on**: M9-2 complete

**Tasks**:

1. **Add the endpoint** in `server/internal/api/handlers/device_handler.go`:

```go
func (h *DeviceHandler) UpgradeFirmware(c *gin.Context) {
	id := c.Param("id")
	taskID, progressCh, err := h.firmwareSvc.StartUpgrade(id)
	if err != nil {
		c.JSON(400, gin.H{
			"success": false,
			"error":   gin.H{"code": "FW_UPGRADE_FAILED", "message": err.Error()},
		})
		return
	}

	go func() {
		room := "firmware:" + id
		for progress := range progressCh {
			h.wsHub.BroadcastToRoom(room, progress)
		}
		h.firmwareSvc.CleanupTask(taskID)
	}()

	c.JSON(202, gin.H{"success": true, "data": gin.H{"taskId": taskID}})
}
```

2. **Register the route** (router.go):

```go
devices.POST("/:id/firmware/upgrade", h.UpgradeFirmware)
```

3. **WebSocket room subscription follows the existing pattern**: the client subscribes to `firmware:<deviceId>`, using the same mechanism as the existing `flash:<deviceId>` room

4. **The derived `DeviceInfo` fields are returned automatically through `Info()`** (done in M9-2)

**Acceptance**:
- `curl -X POST localhost:3721/api/devices/<id>/firmware/upgrade` returns 202 + taskID
- A WebSocket client in the `firmware:<id>` room sees the stream of progress events
- `GET /api/devices` already includes the new fields

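To check the progress stream described above without a frontend, each WebSocket message can be validated against the `FirmwareProgress` shape from M9-2. The sketch below assumes the hub broadcasts the struct as plain JSON with no extra envelope, which the plan does not confirm; `parse_progress_event` and `KNOWN_STAGES` are hypothetical names:

```python
import json

# Stage names from the FirmwareProgress comment in M9-2, plus the "error"
# stage that the service sketch emits on failure.
KNOWN_STAGES = {
    "connecting", "loading_loader", "loading_firmware",
    "verifying", "done", "error",
}


def parse_progress_event(raw):
    """Parse one JSON progress message into (percent, stage, is_terminal).

    Raises ValueError for a stage outside KNOWN_STAGES so that a typo on
    either side of the wire fails loudly in a test run.
    """
    event = json.loads(raw)
    stage = event["stage"]
    if stage not in KNOWN_STAGES:
        raise ValueError(f"unknown stage: {stage}")
    percent = int(event["percent"])
    is_terminal = stage in ("done", "error")
    return percent, stage, is_terminal


print(parse_progress_event('{"percent": 90, "stage": "verifying"}'))
```

A test client can stop reading once `is_terminal` is true, which matches the server closing the progress channel after `done` or `error`.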
---

### M9-4: Frontend FW badge + upgrade UI (1.5 person-days)

**Owner**: Frontend Agent

**Depends on**: M9-3 complete

**Tasks**:

1. **DeviceCard component**: add an FW badge (in `frontend/src/components/devices/device-card.tsx`):
   - Derive the badge color from `device.firmwareVersion` + `firmwareIsLegacy`
   - Red: `firmwareIsLegacy = true` (KDP1 needs upgrade)
   - Yellow: contains `KDP2` but the version string does not match the bundled `bundledFirmwareVersion` (future feature, not refined in the MVP)
   - Green: contains `KDP2` and matches the bundled version

2. **Upgrade button**:
   - Shown when `firmwareCanUpgrade = true`
   - Click → opens the upgrade modal

3. **Upgrade modal**:
   - Shows a warning: "Do not unplug the device during the upgrade / estimated 30-60 seconds"
   - Confirm button → calls `POST /api/devices/:id/firmware/upgrade`
   - Subscribes to the WebSocket `firmware:<id>` room
   - Progress bar + stage hints (Connecting... / Loading loader... / Loading firmware... / Verifying... / Done)
   - Success → toast "Upgrade succeeded" + automatic device rescan
   - Failure → shows the error + prompts the user to re-plug the device

4. **Store changes** in `frontend/src/lib/store/devices-store.ts` (presumed location):
   - Add `firmwareUpgradeProgress` state (key: deviceId, value: FirmwareProgress)
   - Add a `subscribeFirmwareProgress(deviceId)` action
   - Call `rescan()` automatically once the upgrade completes

5. **New i18n keys** (`frontend/src/lib/i18n/{zh-TW,en}.ts`):
   - `devices.firmware.upgrade.button`: 「升級韌體」/ "Upgrade Firmware"
   - `devices.firmware.upgrade.modal.title`: 「韌體升級」/ "Firmware Upgrade"
   - `devices.firmware.upgrade.modal.warning`: 「升級期間請勿拔除裝置...」/ ...
   - `devices.firmware.upgrade.stage.connecting`: 「連線中...」/ "Connecting..."
   - `devices.firmware.upgrade.stage.loadingLoader`: 「載入引導程式...」/ "Loading bootloader..."
   - `devices.firmware.upgrade.stage.loadingFirmware`: 「載入韌體...」/ "Loading firmware..."
   - `devices.firmware.upgrade.stage.verifying`: 「驗證中...」/ "Verifying..."
   - `devices.firmware.upgrade.stage.done`: 「完成」/ "Done"
   - `devices.firmware.upgrade.error.generic`: 「升級失敗、請重新插拔裝置後再試」/ ...
   - `devices.firmware.badge.legacy`: 「需要升級」/ "Update Required"
   - `devices.firmware.badge.outdated`: 「版本較舊」/ "Outdated"
   - `devices.firmware.badge.uptodate`: 「最新」/ "Up to Date"

**Acceptance**:
- `pnpm --dir frontend build` PASS
- Manual test (mock device): the badge displays correctly and the modal flow is smooth
- Real-device test (integrated in M9-5)

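The red / yellow / green rule for the badge can be written down as a small decision function. The sketch below is in Python only to stay consistent with the bridge code; the real component would be TypeScript. It assumes "matches the bundled version" means exact string equality, which the plan leaves open, and the `"unknown"` fallback is a hypothetical addition for firmware strings that are neither legacy nor KDP2:

```python
def badge_state(firmware_version, is_legacy, bundled_version):
    """Map firmware info to a badge key.

    Returns "legacy" (red), "outdated" (yellow), "uptodate" (green),
    or "unknown" (hypothetical fallback, not defined in the plan).
    """
    if is_legacy:
        return "legacy"  # KDP1, needs upgrade
    if "KDP2" not in firmware_version.upper():
        return "unknown"
    if firmware_version == bundled_version:
        return "uptodate"
    return "outdated"  # KDP2, but not the bundled version


print(badge_state("KDP2", False, "KDP2"))
```

The three non-fallback return values line up with the `devices.firmware.badge.*` i18n keys, so the component can use the result directly as a key suffix.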
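---

### M9-5: Real-device verification on three platforms (1 person-day)

**Owner**: Testing Agent

**Depends on**: M9-4 complete

**Tasks**:

1. **macOS (Intel + Rosetta)**:
   - KL520 attached (KDP1, if the user has an old dongle): full upgrade flow
   - KL520 attached (KDP2, the current state): the "no upgrade needed" UI path
   - KL720 attached: similar tests (if a legacy KL720 is available)

2. **Windows**:
   - Same as above, paying particular attention to the WinUSB driver binding state
   - USB re-enumeration during the upgrade must not hang the HTTP connection

3. **Linux** (Ubuntu 22.04/24.04):
   - Same as above, paying particular attention to udev rules

4. **Failure paths**:
   - Device unplugged mid-upgrade → sensible error message, UI can recover
   - App closed mid-upgrade → the server shuts down gracefully and the interrupted upgrade leaves no bad state
   - Upgrade timeout → sensible timeout message, device remains usable

5. **Regression tests**:
   - Existing features (model load / inference) are unaffected
   - Inference works immediately after the upgrade completes

**Acceptance**:
- Smoke test report for all three platforms (per platform, at least one dongle through the full happy path plus one failure path)
- No regressions
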
---

## 2. Files Involved

### 2.1 New files

| Path | Purpose | Estimated size |
|------|---------|----------------|
| `server/scripts/firmware/KL520/fw_loader.bin` | KL520 USB Boot Loader binary (copied from warrenchen) | ~10 KB |
| `server/scripts/firmware/KL520/VERSION` | Bundled FW version record (if not already present) | ~5 lines |
| `server/scripts/firmware/KL720/VERSION` | Same as above | ~5 lines |
| `server/internal/firmware/service.go` | FW upgrade service | ~150 lines |
| `server/internal/firmware/progress.go` | FW progress tracker | ~50 lines |
| `frontend/src/components/devices/firmware-badge.tsx` | FW badge component | ~80 lines |
| `frontend/src/components/devices/firmware-upgrade-modal.tsx` | Upgrade modal | ~200 lines |
| `docs/autoflow/04-architecture/v2/device-firmware.md` | New TDD sub-document | ~300 lines |
| `docs/autoflow/04-architecture/adr/ADR-009-firmware-management.md` | ADR recording the Q9 reversal + design decisions | ~200 lines |

### 2.2 Modified files

| Path | Change | Impact |
|------|--------|--------|
| `server/scripts/kneron_bridge.py` | +`handle_firmware_upgrade()` + main loop dispatch + import time | +~100 lines |
| `server/internal/driver/interface.go` | +`UpgradeFirmware` method, +`FirmwareProgress` struct, +`StatusUpgrading`, +3 new `DeviceInfo` fields | +~30 lines |
| `server/internal/driver/kneron/kl720_driver.go` | +`UpgradeFirmware()` method, +derived fields in `Info()`, +`bundledFirmwareExists()` / `readBundledFwVersion()` helpers | +~80 lines |
| `server/internal/api/handlers/device_handler.go` | +`UpgradeFirmware` handler + DI of `firmwareSvc` | +~30 lines |
| `server/internal/api/router.go` | +route registration | +1 line |
| `server/cmd/main.go` or server init | DI wiring for firmware.Service | +~5 lines |
| `frontend/src/components/devices/device-card.tsx` | Embed FirmwareBadge + upgrade button | +~30 lines |
| `frontend/src/lib/store/devices-store.ts` | +`firmwareUpgradeProgress` state + subscribe action | +~50 lines |
| `frontend/src/lib/i18n/zh-TW.ts` | +12 new keys (as listed in M9-4) | +~15 lines |
| `frontend/src/lib/i18n/en.ts` | Same as above | +~15 lines |
| `docs/autoflow/02-prd/PRD-v2.md` | +§N Firmware Management section | +~80 lines |
| `docs/autoflow/04-architecture/TDD-v2.md` | +§2.10 sub-document index entry, +§3 risk list entries | +~20 lines |
| `installer/{macos,windows,linux}/*` | Confirm the firmware bundle is included in the installer payload (should already be covered) | 0 lines (verification only) |

---

## 3. Affected Sections in TDD v2.1 / Design v2.1 / PRD v2.1

### PRD v2.1 affected sections (proposed patch)

| Section | Change |
|---------|--------|
| §2 Change summary (0.0) | Add a row "v2.1 → v2.2: Firmware Management added" |
| §N Firmware Management (new section) | Main content, including the Q9 reversal statement, MVP scope, and business background |
| §11 Open issues | + N-R5 "User recovery flow after a failed FW upgrade" |
| Change log | Add a row |

### TDD v2.1 affected sections (proposed patch)

| Section | Change |
|---------|--------|
| §0.0 v2.0 → v2.1 diff overview | Add a row "Firmware Management adds the M9 milestone series" |
| §2 Sub-document map | + a row for `v2/device-firmware.md` |
| §3 Risk list | + R-v2-8 "Device unplugged during FW upgrade" + R-v2-9 "USB re-enumerate race after upgrade" |
| §0.1 v1 → v2 diff overview | Add a row "FW management path: R5-Q9 reversed" |
| New `v2/device-firmware.md` | Entirely new document |
| `v2/server-lifecycle.md` | Unchanged (FW upgrade is independent of the server lifecycle) |
| `v2/milestone-plan.md` | + M9 series (M9-0 to M9-5) |

### Design v2.1 affected sections (proposed patch)

| File | Change |
|------|--------|
| Existing `docs/autoflow/03-design/v2/*` | Unchanged |
| New `docs/autoflow/03-design/v2/device-firmware-ui.md` | FW badge design + upgrade modal wireframe + i18n table |
| Chinese/English copy table | + 12 new keys (as listed in M9-4) |

### New ADR

`docs/autoflow/04-architecture/adr/ADR-009-firmware-management.md`:
- Status: Proposed → awaiting user sign-off
- Context: the user asked for FW upgrade, which reverses the Q9 decision to cut flashing
- Decision: use the KneronPLUS Python API (`kp.core.update_kdp_firmware_from_files`); do not introduce DFUT.exe
- Alternatives: (a) DFUT.exe, rejected because it is Windows-only; (b) ctypes, rejected because the KneronPLUS Python API is sufficient; (c) doing nothing, rejected because this is a real pain point
- Consequences: +1 new module, ~5 person-days of effort, 0 KB installer impact

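---

## 4. Risk List

### R-FW-1: Device unplugged during upgrade (medium risk)

**Scenario**: the user unplugs the USB cable during the firmware load stage.

**Impact**:
- KL520: flash is not written; the worst case is "upgrade did not happen", and after a re-plug the device is still in its old state. No brick.
- KL720: flash may be mid-write, so there is a risk of bricking

**Mitigation**:
1. The UI modal warns "Do not unplug the device during the upgrade"
2. The KL720 upgrade path adds a more explicit "must not be interrupted" notice
3. On failure, rescan automatically and prompt the user to re-plug
4. Document it: if a user really does unplug a KL720, warrenchen's DFUT.exe can be used to unbrick it (not bundled into visionA-local; internal SOP only)
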
### R-FW-2: USB re-enumerate race after upgrade (low risk)

**Scenario**: upgrade succeeds → device disconnect returns non-zero → re-enumeration is in progress → an immediate reconnect may pick up a stale handle.

**Mitigation**:
- Inside the bridge.py handler, `time.sleep(3)` waits for USB to settle
- Do not reconnect inside the handler; after it returns success, the Go side runs a rescan and calls GetDevice again

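A fixed `time.sleep(3)` is either too long on fast hosts or too short on slow ones. One possible refinement, sketched here under assumptions, is to poll the scan until the port reappears or a deadline passes. `wait_for_port` is a hypothetical helper, and `scan` stands for any zero-argument wrapper around the real device scan that returns objects with a `usb_port_id` attribute:

```python
import time
from types import SimpleNamespace


def wait_for_port(scan, target_port, timeout_s=10.0, interval_s=0.5, sleep=time.sleep):
    """Poll scan() until a device on target_port shows up; return it or None.

    Always scans at least once, even with timeout_s=0. The sleep function is
    injectable so tests can run without real delays.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        for dev in scan():
            if str(dev.usb_port_id) == str(target_port):
                return dev
        if time.monotonic() >= deadline:
            return None
        sleep(interval_s)


# Usage with a fake scanner: the device "re-enumerates" on the third scan.
calls = {"n": 0}


def fake_scan():
    calls["n"] += 1
    return [] if calls["n"] < 3 else [SimpleNamespace(usb_port_id=21)]


dev = wait_for_port(fake_scan, "21", timeout_s=5, sleep=lambda s: None)
print(dev is not None, calls["n"])
```

This keeps the "do not reconnect inside the handler" rule intact: the helper only observes that the port is back and leaves the actual reconnect to the Go side.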
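### R-FW-3: KL520 upgrade triggered when not needed (low risk)

**Scenario**: the user's dongle is already on KDP2, but the UI shows the "Upgrade" button (badge computed incorrectly).

**Mitigation**:
- In the bridge.py handler, if the detected firmware is already KDP2, take the short-circuit path (load_firmware to RAM only, no flash write, completes in < 5 seconds)
- The driver's `FirmwareIsLegacy` check must be strict (the string must contain `KDP` but not `KDP2`)
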
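The strict rule is easy to get subtly wrong, because `"KDP"` is a substring of `"KDP2"`. A minimal Python sketch of the same check that the Go `Info()` and the bridge handler both perform; the function name is hypothetical, and the sample strings are illustrative rather than confirmed SDK output:

```python
def is_legacy_firmware(firmware_str):
    """True only for KDP1-style strings: contains "KDP" but not "KDP2"."""
    fw = firmware_str.upper()
    return "KDP" in fw and "KDP2" not in fw


for sample in ("KDP", "kdp", "KDP2", "KDP2 Loader", "", "unknown"):
    print(repr(sample), is_legacy_firmware(sample))
```

An empty or unrecognized string is treated as not legacy, so the upgrade button stays hidden when detection fails.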
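### R-FW-4: Cross-chip mix-up (medium risk)

**Scenario**: bridge.py uses the wrong firmware file during an upgrade (KL520 firmware flashed onto a KL720).

**Mitigation**:
- The handler requires the `chip` parameter, and `_resolve_firmware_paths(chip)` rejects an empty string
- The driver has already determined `d.chipType` from the USB pid and passes it to the bridge, so the value never depends on the frontend
- Add a unit test confirming that a KL520 upgrade never touches a KL720 path
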
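The body of `_resolve_firmware_paths` is not shown in this plan, so the following is only a sketch of what a strict variant could look like. The function name `resolve_firmware_paths_strict`, the allow-list, and the `<root>/<chip>/fw_scpu.bin` + `fw_ncpu.bin` layout are assumptions (the file names come from R-FW-5, the directory from the new-files table):

```python
import os

SUPPORTED_CHIPS = ("KL520", "KL720")  # MVP scope only (assumption)


def resolve_firmware_paths_strict(firmware_root, chip):
    """Return (scpu_path, ncpu_path) for chip, or (None, None) if files are missing.

    Raises ValueError for an empty or unsupported chip, so a missing
    parameter can never fall through to some default directory.
    """
    if chip not in SUPPORTED_CHIPS:
        raise ValueError(f"unsupported or missing chip: {chip!r}")
    chip_dir = os.path.join(firmware_root, chip)
    scpu = os.path.join(chip_dir, "fw_scpu.bin")
    ncpu = os.path.join(chip_dir, "fw_ncpu.bin")
    if not (os.path.isfile(scpu) and os.path.isfile(ncpu)):
        return None, None
    return scpu, ncpu
```

Because both paths are built from a single `chip_dir`, a KL520 request cannot resolve to a KL720 file, which is the property the unit test in the mitigation list is meant to pin down.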
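### R-FW-5: Signing / legal risk (to be clarified)

**Scenario**: is it legal to bundle official Kneron firmware (fw_scpu.bin / fw_ncpu.bin / fw_loader.bin) into our installer?

**Mitigation**:
- We have already bundled KDP2 firmware for 4 months; what Q9 cut was "user-initiated flashing", not "bundling firmware"
- B4 already has an unresolved "Kneron bundled model re-distribution licensing" issue (progress.md L792); firmware is the same kind of issue
- **Recommendation**: confirm with Kneron before release and handle it together with the model bundle

### R-FW-6: HTTP timeout (low risk)

**Scenario**: upgrading a KL720 takes ~180 s, and HTTP keep-alive / proxy timeouts may be shorter.

**Mitigation**:
- API design: HTTP returns 202 immediately and actual progress goes over WebSocket (this pattern is already adopted)
- Set the WebSocket heartbeat interval to < 60 s

### R-FW-7: Windows admin privileges (medium risk)

**Scenario**: `kp.core.install_driver_for_windows` requires admin; if the WinUSB driver is not bound, the upgrade fails at the first step.

**Mitigation**:
- visionA-local already has driver installation logic (M1+ TODO)
- When the upgrade handler detects an unbound driver, it returns an explicit error message that guides the user
- Do not install the driver automatically from inside visionA-local; prompt the user to rerun the installer
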
---

## 5. Conflicts with the Existing Architecture (Details)

### 5.1 Do not reuse the existing `Flash()` method

`Flash()` loads a model and must not be mixed with firmware upgrade. Make sure the plan does **not** touch `flash/service.go`.

### 5.2 `kl720_driver.go` file name baggage

The `UpgradeFirmware()` method goes into this file, but the file is **not** renamed (out of scope). The new `firmware/` package sits alongside `flash/` rather than being nested inside `kneron/`.

### 5.3 Do not extend the existing `restartBridge()`

`restartBridge()` is used when a KL520 switches models, which is a different flow from firmware upgrade. Do **not** extend it; the new flow is independent.

### 5.4 Make good use of the existing `needsReset` flag

After the upgrade completes, the driver should set `needsReset=true` so that the next connect goes through the full reset flow (existing logic). This ensures the first inference after the upgrade does not hit Error 15 SEND_DATA_TOO_LARGE.

### 5.5 Existing WS room naming convention

Existing rooms are `flash:<id>` / `inference:<id>`; the new `firmware:<id>` follows the same naming.

### 5.6 Existing `DeviceInfo.Status` enum

Add `StatusUpgrading` alongside the existing `StatusConnecting / StatusFlashing / StatusInferencing`. The frontend needs a matching status badge.

### 5.7 Existing watchServer Error state mechanism

A failed FW upgrade is **not** escalated to the server Error state (it is a device-level failure); only the device enters `StatusError`. Make sure the watchServer goroutine does not mistake an FW upgrade timeout for a dead server.

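---

## 6. Notes for Evaluating Stage B after Stage A (Outside the MVP Scope)

Once Stage A is validated, the user can decide whether to proceed with Stage B. Stage B candidates:

1. **Manual downgrade KDP2 → KDP1** (for developer testing)
   - Requires copying `firmware/KL520_kdp/` (~80 KB) from warrenchen
   - Add a `handle_firmware_downgrade` handler to bridge.py
   - Exposed under Settings > Advanced panel (not in the main Devices page UI)
   - Estimated 1.5 person-days

2. **Add KL630 / KL730 support**
   - First extend the driver / bridge.py to recognize product_id 0x0630 / 0x0730 and load firmware
   - Verify whether the corresponding KneronPLUS SDK version supports these two chips
   - Estimated 3-4 person-days (driver extension + FW upgrade extension + three-platform verification)

3. **Multiple firmware versions side by side**
   - `server/scripts/firmware/<chip>/<version>/fw_*.bin`
   - Frontend dropdown for the user to choose a version
   - Estimated 1.5 person-days

Stage B totals about 6-7 person-days; with the 5 person-day MVP, the full version comes to about 11-12 person-days.
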
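---

## 7. Suggested Next Steps for the Orchestrator

1. **Give this research plan to the user for review**
2. Once the user agrees to the MVP (Option A):
   - Start the **PM Agent** to add the Firmware Management section to PRD v2.2
   - The **Architect Agent** itself adds TDD v2.1 §2.10 and writes ADR-009
   - Start the **Design Agent** to add the Devices page FW UI spec
3. Three-way cross-review + user sign-off
4. Development proceeds through M9-1 to M9-5 (per §1 of this document)
5. Stage B stays on hold and is re-evaluated after Stage A is validated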